[57] ABSTRACT

A data packet switch in general, and an Asynchronous Transfer
Mode switch in particular, employing a plurality of physically
separate memory modules operates like a single shared memory
switch by allowing sharing of all of the memory modules among
all of the inputs and outputs of the switch. The disclosed
switching apparatus consists of multiple independent stages,
where different stages of the switch operate without a common
centralized controller. The disclosed switch removes the
performance bottleneck commonly caused by use of a centralized
controller in the switching system. Incoming data packets are
assigned routing parameters by a parameter assignment circuit
based on the packets' output destination and the current state
of the switching system. The routing parameters are then
attached as an additional tag to input packets for their
propagation through various stages of the switching apparatus.
Packets with the attached routing parameters pass through
different stages of the switching apparatus and the
corresponding switching functions are locally performed by each
stage based only on the information available locally. Memory
modules along with their controllers use information available
locally to perform memory operations and related memory
management to realize the overall switching function. The
switching apparatus and the method facilitate sharing of
physically separate memory modules without using a centralized
memory controller. The switching apparatus and the method
provide higher scalability, simplified circuit design, pipeline
processing of data packets and the ability to realize various
memory sharing schemes for a plurality of memory modules in
the switch.

14 Claims, 56 Drawing Sheets

Fig. 4: Traversal of Sliding-Window in global buffer space

Fig. 5: Assignment of Self-Routing Parameters (i,j,k)

Fig. 15-3: Pipeline Stage A(3,3): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 16-3: Pipeline Stage B(3,4): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 16-4: Pipeline Stage R(4,4): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.

Fig. 17-3: Pipeline Stage C(3,5): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 17-4: Pipeline Stage R(4,5): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.

Fig. 18-3: Pipeline Stage D(3,6): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 18-4: Pipeline Stage R(4,6): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.




12/08/2003, EAST Version: 1.4.1 



U.S. Patent Sep. 19, 2000 Sheet 32 of 56 



6,122,274

[Drawing sheets 32 to 35 of 56: figure content not recoverable from the
scan. Legible labels: "Output interconnection network: m x r = 6x4;
routing of cells to output lines is based on d information" (Sheet 32)
and "Input interconnection network: n x m = 4x6; routing of cells is
based on i parameter" (Sheet 34).]

Fig. 19-3: Pipeline Stage E(3,7): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 19-4: Pipeline Stage R(4,7): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.

Fig. 20-3: Pipeline Stage F(3,8): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 20-4: Pipeline Stage R(4,8): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.




[Drawing sheets 42 to 44 of 56: figure content not recoverable from the
scan. Legible labels: "Routing of cells to output lines is based on d
information" (Sheet 42) and "Input interconnection network: n x m = 4x6;
routing of cells is based on i parameter" (Sheet 43).]

Fig. 21-2: Pipeline Stage G(3,9): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 21-3: Pipeline Stage R(4,9): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.

Fig. 22-1: Pipeline Stage H(3,10): WRITE

WRITE ATM cells in the j th location of i th memory module; Set OSA(j)=k

Fig. 22-2: Pipeline Stage R(4,10): READ

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.

Fig. 23-1: Pipeline Stage R(4,11): READ ATM Cells

READ ATM cells from memory location (SW.osv) if OSA(SW.osv)=SW.sp;
Set OSA(SW.osv)=0 for Read cells.

Fig. 26: Occupancy of memory space in the example
4x4 switch for 12 cycles of cell arrivals.

Fig. 28: Occupancy of memory space in the example 4x4
switch for 16 pipeline cycles of cell arrivals. Control of a
queue inside the shared space is shown for an unbalanced
traffic.

ATM SWITCHING SYSTEM WITH
DECENTRALIZED PIPELINE CONTROL
AND PLURAL MEMORY MODULES FOR
VERY HIGH CAPACITY DATA SWITCHING

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

Not applicable 

STATEMENT REGARDING FEDERALLY 
SPONSORED RESEARCH AND 
DEVELOPMENT 

Not applicable 

REFERENCE TO A MICROFICHE APPENDIX 
Not applicable 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to high capacity packet switching
apparatus in general, and to Asynchronous Transfer Mode
(ATM) cell switching apparatus in particular, which is
typically used for high speed multimedia networking
communications. More particularly, this invention is directed
towards a decentralized, pipeline control based ATM
switching apparatus and method to enable high capacity
switching.

2. Prior Art 

Besides offering the best possible delay-throughput performance,
ATM switching systems employing shared buffers are also
known in the art to incur the lowest cell-loss rate
compared to ATM switches employing input or
output buffering strategies. However, the design of a
large shared-buffer based ATM switching system has been
severely restricted by high memory bandwidth requirements,
segregation of the buffer space, and the centralized buffer
control bottleneck, which cause the switch performance to
degrade as the switch grows in size. In order to preserve its
ability to provide a low cell-loss rate for a given buffer size,
an ATM switching network design should provide global buffer
sharing among all its input and output lines, provide
memory sharing schemes that allow fair sharing of a common
memory space under different traffic types, and alleviate
the performance bottleneck caused by centralized control.

A traditional approach to designing a large shared-buffer
based ATM switching system has been to first design
shared-buffer ATM switching modules of a feasible size and then
interconnect a plurality of such modules in some fashion to
build a large switching system. Some of the previously
used methods and schemes for building a large shared-buffer
based ATM switch can be categorized as follows:

The Multistage Interconnection Network (MIN)
approach: According to this general scheme, a multistage
interconnection network is used to build a large shared-buffer
based switching system with small shared-buffer
switching elements deployed at each node of the
interconnection network [SAKURAI Y. et al., "Large-Scale
ATM Multistage Switching Network with Shared Buffer
Memory Switches," IEEE Communication, January 1991].
This general scheme of switch growth is known to cause
degradation in the performance of a shared-buffer architecture
as the switch grows in size. Degradation in cell-loss and
throughput performance results mainly from internal link
conflicts, output blocking and incomplete buffer sharing due
to separation of memory space among a plurality of modules.
Furthermore, this approach clearly does not allow
global sharing of the employed buffer space among all of its
input-output ports. Because of the separation of buffer space,
not all output lines can share the entire buffer space of the
switch. Under unbalanced traffic it is possible for some
switch buffers to overflow while other switch buffers are
underutilized.

Growable switch approach [ENG K. Y. et al., "A Growable
Packet (ATM) Switch Architecture: Design, Principles and
Applications," IEEE Transactions on Communications,
February 1992]: Unlike the Multistage Interconnection Network
approach mentioned above, in the growable switch approach a
plurality of shared-buffer based switches are organized in a
single stage preceded by a bufferless [Nx(m/n)N]
interconnection network. Although this approach succeeds
in providing improved overall performance compared to
the general MIN approach, it does not allow global sharing
of memory space among all its inputs and outputs. It is
known in the art that this scheme does not provide the best
buffer utilization, as it is possible for a buffer belonging to a
group of output ports to overflow under unbalanced or
bursty traffic conditions while other buffers belonging to
other output ports are empty.

The Multiple Shared Memory (MSM) approach [WEI
S. X. et al., "On the Multiple Memory Module Approach to
ATM Switching," IEEE INFOCOM, 1992]: Unlike the
previous two approaches mentioned above, this approach
allows global sharing of the employed buffer space.
However, the MSM switch approach employs centralized control
of a switching system consisting of a plurality of memory
modules. Use of centralized control can become a
performance bottleneck as the switch grows in size. Furthermore,
in the MSM switch approach, the conditions for the best
possible delay-throughput performance have been derived under
the assumption of infinite buffer space in the switching
system. In reality, buffer space is finite, and a
realistic switching algorithm must accommodate the
constraints imposed by the finiteness of the buffer space in
an ATM switching system. A finite buffer space results in
cell loss and, in the absence of an appropriate buffer sharing
scheme, in performance degradation [KAMOUN
F. and KLEINROCK L., "Analysis of Shared Finite Storage
in a Computer Network Node Environment Under General
Traffic Conditions," IEEE Transactions on Communications,
July 1980]. A switching scheme which provides global
sharing of the buffer space may not necessarily provide the
best possible delay-throughput performance if the shared
buffer space is finite. In order to provide the best
possible performance with a finite common buffer space, a
switching scheme should also be able to enforce various
buffer sharing schemes to provide fair sharing of the finite
buffer space under various traffic types.

In [OSHIMA et al., "A New ATM Switch Architecture
based on STS-Type Shared Buffering and Its
Implementation," ISS 1992], the proposed shared multibuffer
(SMB) based ATM switch design also provides
complete sharing of memory space among all its input and
output ports. The shared multibuffer based ATM switch is
also disclosed in the recently assigned U.S. Pat. No. 5,649,217
to Yamanaka et al. The shared multibuffer switch of
Yamanaka et al., schematically shown in FIG. 1, uses a
centralized controller to centrally control and manage a
plurality of buffers and their write and read operations for each
incoming and outgoing cell, centrally manage and update a
plurality of address queues for each incoming and outgoing
cell, centrally provide instructions to the incoming and
outgoing spatial switches on how to route ATM cells
corresponding to each of the input and output lines, and
centrally coordinate the operation of its various components
to realize the overall switching function of the switching
apparatus. The disadvantage of this approach is that the
centralized controller can become a performance bottleneck
as the switch grows in size (i.e., as the input and output lines
increase in number and/or speed). Growth in the size of the
switch, and hence in the number of input and output lines, would
require the centralized controller to perform an increased
number of tasks (such as write and read operations for ATM cells,
and storage and management of information in address queues in
the central controller) for an increased number of memory
modules and input/output lines in a fixed switching time-slot.
Similarly, as the switch grows in size, the central
controller will need to provide an increased number of routing
instructions to the incoming line spatial switch and outgoing
line spatial switch for an increased number of input and output
lines in a fixed switching time-slot. Overall, the centralized
controller will have to perform an increased number of all the
centralized control functions and memory operations described
therein in a fixed switching time-slot (which is usually smaller
than the interarrival time of two consecutive cells). The
centralized controller used by Yamanaka et al., as
disclosed in U.S. Pat. No. 5,649,217, can therefore easily become
a bottleneck to the switch performance as the switch grows in
size or switching capacity.
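
The scaling argument can be made concrete with a rough budget calculation. The sketch below is illustrative only: the 53-byte cell size is standard for ATM, but the 155.52 Mb/s line rate and the assumption that a centralized controller serially performs one buffer write and one buffer read per line in every time-slot are choices made for this example, not figures taken from U.S. Pat. No. 5,649,217. The function names are hypothetical.

```python
# Rough per-operation time budget for a centralized controller.
# Assumptions (not from the patent): 155.52 Mb/s lines, and one buffer write
# plus one buffer read per line handled serially in every cell time-slot.

CELL_BITS = 53 * 8  # an ATM cell is 53 bytes long


def slot_time_ns(line_rate_bps: float) -> float:
    """Duration of one cell time-slot on a line, in nanoseconds."""
    return CELL_BITS / line_rate_bps * 1e9


def per_operation_budget_ns(num_lines: int, line_rate_bps: float) -> float:
    """Time available per memory operation when 2 * num_lines operations
    must be completed serially within one time-slot."""
    return slot_time_ns(line_rate_bps) / (2 * num_lines)


if __name__ == "__main__":
    # The budget per operation halves every time the line count doubles.
    for n in (4, 16, 64, 256):
        print(n, round(per_operation_budget_ns(n, 155.52e6), 1))
```

Under these assumptions the time-slot stays fixed while the number of operations grows with the number of lines, so the time available for each centrally controlled operation shrinks in inverse proportion to the switch size.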

BRIEF SUMMARY OF THE INVENTION 

The above mentioned problems, and in particular the
bottleneck problem caused by the use of a centralized
controller (as described in U.S. Pat. No. 5,649,217), are removed
by the switching method and apparatus of the disclosed
invention. The disclosed switching method and apparatus
(i) alleviate the need for a centralized buffer controller and
hence remove the performance bottleneck resulting from the use
of a centralized controller, (ii) provide a way to partition the
overall switching function into multiple independent switching
operations such that the independent operations can be performed
in parallel, (iii) partition the switching apparatus into multiple
independent stages, with each stage running one of the above
mentioned independent switching operations, (iv) operate the
multiple independent stages in a pipeline fashion in order to
enhance parallelism while processing the incoming ATM cells for
switching purposes, (v) provide decentralized control such
that the multiple independent stages perform their switching
operations based on the information available locally and
do not have to depend on any central controller to
provide centrally updated global variables or switching or
buffer management related instructions, (vi) facilitate
efficient sharing of a finite buffer space among all the switch
inputs and outputs, and (vii) provide various memory sharing
schemes to allow for fair sharing of a common memory
space under various traffic types.

A switching method is also disclosed according to which
the entire memory space of the switching apparatus is
depicted as a multidimensional globally shared buffer space.
The coordinates of the space help identify a proper location
for incoming cells in the global buffer space so that they can
be switched with the best possible delay-throughput
performance. According to this method, each incoming cell is
assigned self-routing parameters in the form of an
additional self-routing tag for its self-propagation through
the various pipeline stages of the switching apparatus. As the
ATM cells pass through different stages of the switching
apparatus, the corresponding switching functions are locally
performed by each stage on the received ATM cells. Each
stage uses the value of the self-routing parameters in the
received cells while performing its local switching
operations. Because of the pipeline processing of ATM cells, the
switching capacity of the system is enhanced considerably.
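
The pipeline timing can be illustrated with a small sketch. It assumes, as the drawing captions suggest (stage 3 handles the first arrival cycle A in pipeline cycle 3 and the eighth arrival cycle H in pipeline cycle 10), that a batch of cells arriving in cycle c reaches write-path stage s in pipeline cycle c + s - 1. The function name, the batch labels and the restriction to the first three stages are illustrative assumptions, not a definitive description of the apparatus.

```python
from typing import Optional

BATCH_LABELS = "ABCDEFGH"  # eight arrival cycles, labelled as in the drawings


def batch_at(stage: int, pipeline_cycle: int,
             num_batches: int = len(BATCH_LABELS)) -> Optional[str]:
    """Return the label of the arrival batch that `stage` works on during
    `pipeline_cycle`, or None if the stage has no batch in that cycle.

    Assumes a batch arriving in cycle c reaches stage s in cycle c + s - 1.
    """
    arrival = pipeline_cycle - stage + 1
    if 1 <= arrival <= num_batches:
        return BATCH_LABELS[arrival - 1]
    return None


if __name__ == "__main__":
    # One row per pipeline cycle, one column per write-path stage (1 to 3).
    for cycle in range(1, 11):
        row = [batch_at(stage, cycle) or "-" for stage in (1, 2, 3)]
        print(f"cycle {cycle:2d}: " + " ".join(row))
```

In this model all three stages are busy at once from pipeline cycle 3 through pipeline cycle 8, each on a different batch of cells, which is the overlap that pipeline processing relies on.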

Memory modules and the resulting global buffer space are not
controlled and managed by any centralized buffer controller.
The memory modules are independent, and each uses its local
memory controller to perform WRITE and READ operations
for the received ATM cells and also to perform related
memory management. The local memory controllers work
independently of each other and still help manage and
control the globally shared buffer space of the switching
apparatus. For a write operation, the local memory controllers use
the self-routing parameters of received cells to determine the
write address for the cells and write them to the respective
locations in their memory modules. For a read operation, the
local memory controllers use the disclosed switching
method to generate their read addresses to read cells from
their memory modules.
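
The write and read rules of a local memory controller can be sketched as follows. The sketch follows the rules stated in the drawing captions: write the cell to location j and set OSA(j)=k; read location SW.osv only if OSA(SW.osv)=SW.sp, then set OSA(SW.osv)=0. The class name, method names, the string used to stand in for a cell, and the unused index 0 are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LocalMemoryController:
    """One memory module plus its Output Scan Array (OSA).

    Locations are numbered 1..size; OSA[j] == 0 marks location j as empty.
    """
    size: int
    cells: List[Optional[str]] = field(init=False)
    osa: List[int] = field(init=False)

    def __post_init__(self) -> None:
        # Index 0 is unused so that indices match the 1-based description.
        self.cells = [None] * (self.size + 1)
        self.osa = [0] * (self.size + 1)

    def write(self, cell: str, j: int, k: int) -> None:
        """WRITE: store the cell in location j and record its scan plane k."""
        self.cells[j] = cell
        self.osa[j] = k

    def read(self, sw_osv: int, sw_sp: int) -> Optional[str]:
        """READ: output the cell at the sliding-window location only if its
        scan-plane value matches the sliding window's scan plane."""
        if self.osa[sw_osv] != sw_sp:
            return None
        cell = self.cells[sw_osv]
        self.cells[sw_osv] = None
        self.osa[sw_osv] = 0  # mark the location free again
        return cell
```

Both operations use only the tag values (j, k) carried by the cell and the sliding-window position (SW.osv, SW.sp), so each controller can act on its own module without consulting a central address queue.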

The disclosed switching system facilitates efficient
sharing of a finite buffer space among all the switch inputs
and outputs. The proposed switching system can provide
complete buffer sharing, partial buffer sharing and complete
partitioning of the entire buffer space employed in the
system. Because of its ability to operate in a decentralized
pipeline fashion, the disclosed switching method can be used
to design a large shared-buffer based ATM switching
system. Because of its ability to realize various buffer
sharing schemes, the disclosed switching method and
apparatus can be designed for high throughput performance
under various traffic types.
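
The three sharing schemes named above differ in the admission test applied when a cell arrives. The sketch below is a simplified illustration and not the mechanism of the disclosed apparatus: it models the buffer as per-output occupancy counts, and the function names, the scheme strings and the per-output cap used for partial sharing are all assumptions.

```python
from typing import Dict, List


def admit(occupancy: List[int], port: int, total: int, scheme: str,
          cap: int = 0) -> bool:
    """Decide whether a cell for output `port` may enter the buffer.

    occupancy[p] is the number of buffered cells destined to output p and
    `total` is the overall buffer size in cells.
    """
    used = sum(occupancy)
    if scheme == "complete_sharing":
        # Any output may use any free location.
        return used < total
    if scheme == "complete_partitioning":
        # Each output owns an equal, private share of the buffer.
        return occupancy[port] < total // len(occupancy)
    if scheme == "partial_sharing":
        # Shared space, but no output may hold more than `cap` cells.
        return used < total and occupancy[port] < cap
    raise ValueError(f"unknown scheme: {scheme}")


def burst_losses(num_ports: int, total: int, burst: int) -> Dict[str, int]:
    """Count cells lost when `burst` cells all arrive for output 0."""
    losses = {}
    for scheme in ("complete_sharing", "complete_partitioning",
                   "partial_sharing"):
        occupancy = [0] * num_ports
        lost = 0
        for _ in range(burst):
            if admit(occupancy, 0, total, scheme, cap=total // 2):
                occupancy[0] += 1
            else:
                lost += 1
        losses[scheme] = lost
    return losses
```

With 4 outputs, 16 buffer locations and a 10-cell burst to one output, complete sharing admits every cell, whereas complete partitioning discards the cells that exceed the output's private share. The cap in the partial scheme limits how much of the shared space a single congested output can take from the others.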

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram showing the ATM switching
architecture having a centralized controller and plural
buffer memories disclosed in U.S. Pat. No. 5,649,217;

FIG. 2 is a schematic diagram showing the ATM switching
architecture having decentralized pipeline control and
plural buffer memories according to a preferred embodiment
of the present invention;

FIG. 3 is an illustration of the multidimensional global buffer
space, which includes all the ATM cell memory locations in
all the memory modules employed by the switching system,
according to the present invention;

FIG. 4 is a flow diagram of the portion of the disclosed
method that provides the underlying switching functions for the
switching apparatus, according to this invention;

FIG. 5 illustrates a flow diagram of the portion of the
disclosed method that computes and assigns self-routing
parameters to the incoming ATM cells in the switching
apparatus, according to this invention;

FIG. 6 illustrates a block diagram of the self-routing
parameter assignment circuit using the self-routing
parameter assignment method;

FIG. 7 is a block diagram showing the components of the
memory controller using the disclosed switching method,
according to the present invention;

FIG. 8 illustrates flow diagrams for the memory write and
memory read operations performed each cycle by the
memory controller of FIG. 7;

FIG. 9 shows the time chart for the decentralized pipeline
operation of the various stages of the switching system;

FIG. 10 shows an instance of eight cycles of incoming
cells input to an exemplary 4x4 switching apparatus,
according to the disclosed switching system and method in the
present invention;

FIG. 11 shows a schematic diagram of a 4x4 ATM
switching apparatus employing the decentralized pipeline
control based switching method disclosed according to an
exemplary embodiment of the present invention;

FIG. 12 illustrates decentralized pipeline operation of
multiple stages while performing the switching operation on
eight cycles of incoming ATM cells, according to the
preferred embodiment of the present invention;

FIG. 13 illustrates the content of various counters and
tables after the switching functions performed by the first
pipeline stage in the first pipeline cycle of the 4x4 ATM
switching apparatus according to an exemplary embodiment of
the present invention;

FIGS. 14-1 and 14-2 illustrate the switching functions
performed for the received cells in the second pipeline cycle by
the first and second pipeline stages of the switching
apparatus according to an exemplary embodiment of the present
invention;

FIGS. 15-1, 15-2 and 15-3 illustrate the switching
functions performed for the received cells in the third pipeline
cycle by the first, second and third pipeline stages of the
switching apparatus according to an exemplary embodiment
of the present invention;

FIGS. 16-1, 16-2, 16-3 and 16-4 illustrate the switching
functions performed for the received cells in the fourth
pipeline cycle by the first, second, third and fourth pipeline
stages of the switching apparatus according to an exemplary
embodiment of the present invention;

FIGS. 17-1, 17-2, 17-3, 17-4 and 17-5 illustrate the
switching functions performed for the received cells in the
fifth pipeline cycle by the first, second, third, fourth and fifth
pipeline stages of the switching apparatus according to an
exemplary embodiment of the present invention;

FIGS. 18-1, 18-2, 18-3, 18-4 and 18-5 illustrate the
switching functions performed for the received cells in the
sixth pipeline cycle by the first, second, third, fourth and
fifth pipeline stages of the switching apparatus according to
an exemplary embodiment of the present invention;

FIGS. 19-1, 19-2, 19-3, 19-4 and 19-5 illustrate the
switching functions performed for the received cells in the
seventh pipeline cycle by the first, second, third, fourth and
fifth pipeline stages of the switching apparatus according to
an exemplary embodiment of the present invention;

FIGS. 20-1, 20-2, 20-3, 20-4 and 20-5 illustrate the
switching functions performed for the received cells in the
eighth pipeline cycle by the first, second, third, fourth and
fifth pipeline stages of the switching apparatus according to
an exemplary embodiment of the present invention;

FIGS. 21-1, 21-2, 21-3 and 21-4 illustrate the switching
functions performed for the received cells in the ninth
pipeline cycle by the second, third, fourth and fifth pipeline
stages of the switching apparatus according to an exemplary
embodiment of the present invention;

FIGS. 22-1, 22-2 and 22-3 illustrate the switching
functions performed for the received cells in the tenth pipeline
cycle by the third, fourth and fifth pipeline stages of the
switching apparatus according to an exemplary embodiment
of the present invention;

FIGS. 23-1 and 23-2 illustrate the switching functions
performed for the received cells in the eleventh pipeline
cycle by the fourth and fifth pipeline stages of the switching
apparatus according to an exemplary embodiment of the
present invention;

FIG. 24 illustrates the switching functions performed for
the received cells in the twelfth pipeline cycle by the fifth
pipeline stage of the switching apparatus according to an
exemplary embodiment of the present invention;

FIG. 25 shows the input and output time relation for the
previous stream of cell arrivals for 8 pipeline cycles and the
switch operation for up to 22 pipeline cycles until all the
cells resident in the global buffer space are output. The
update process for the sliding-window counter belonging to
the read stage is also shown.

FIG. 26 illustrates the occupancy of the multidimensional
global buffer space for cells input to the switching apparatus
for 12 consecutive cycles, according to the disclosed
sliding-window switching method of the present invention;

FIG. 27 shows the input and output time relation for a stream
of incoming cells for 12 pipeline cycles and the status of the
sliding-window counter in the read stage, according to the
exemplary embodiment of the switching apparatus and
method of the present invention; and

FIG. 28 illustrates operation of the switch under
unbalanced traffic, where multiple streams of incoming cells
are destined to one particular output port. Under such traffic
conditions, the process of queue control inside the globally
shared buffer space is shown. The occupancy of the
multidimensional global buffer space and an instance of cell discard
for cells input to the switching apparatus for 16 consecutive
cycles, according to the switching method of the disclosed
invention, is also shown.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

Referring now in specific detail to the drawings, with
reference numerals identifying similar or identical elements,
the preferred embodiment of the present invention will be
described. FIG. 2 shows the overall architecture of the ATM
switching system as an example of the packet switching
apparatus employing decentralized pipeline control of
memory and switching functions according to this invention.
In FIG. 2, the input lines are denoted by 1_1, 1_2, . . . 1_n and
the output lines are denoted by 2_1, 2_2, . . . 2_r. Input lines carry
the incoming ATM cells and the output lines carry the
outgoing ATM cells after they have been switched to their output
destination by the ATM switching system of FIG. 2. In this
switching system no central buffer controller is used to
centrally store addresses of cell headers in the address
queues, or to keep track of all the read and write operations
for all the memory modules, or to coordinate corresponding
buffer management operations, or to provide related control
instructions to different components of the switching
apparatus. The switching system instead uses a decentralized
control according to which each incoming ATM cell is
assigned a self-routing tag. The self-routing tags allow the
ATM cells to proceed independently (that is, not under the
instruction of a central controller) through the different stages
of the switching apparatus and enable various switching
functions to take place at different stages based on the
information stored in the self-routing tags of the cells. The
incoming cells are processed by header processing circuits
10_1, 10_2, . . . 10_n for extraction of the output line destination
address, denoted by d. The destination addresses of incoming
cells are forwarded to a self-routing parameter assignment
circuit 14. The self-routing parameter assignment circuit 14
uses the output destination information d and a parameter
assignment method to provide a set of self-routing
parameters (i,j,k) to each incoming ATM cell. The self-routing
parameters (i,j,k), which are obtained by the self-routing
parameter assignment circuit 14, are then attached as a
self-routing tag to the incoming ATM cells by the header
processing circuits 10_1, 10_2, . . . 10_n. Hereinafter, each
incoming cell uses the attached self-routing tag (i,j,k) to
propagate through the various stages of the disclosed
ATM switching apparatus of FIG. 2. The parameter i
in a cell's self-routing tag designates the memory module that
the cell will be stored in; the parameter j in a cell's
self-routing tag designates the memory location in the i-th
memory module that the cell will be stored to; the parameter
k in the self-routing tag designates an additional parameter
called the scan plane, which helps decide when a given ATM
cell is to be read out of the memory for output purposes. The
input interconnection network 20 uses the parameter i of the
routing tag of an incoming ATM cell to route the cell on a
given input line to its i-th output line, which in turn is
connected to the respective memory module. Input lines
of the interconnection network 20 connect to the
header processing circuits 10_1, 10_2, . . . 10_n while the output
lines of the interconnection network 20 connect to
the memory modules of the switching apparatus. Input
modules 30_1, 30_2, . . . 30_m are used corresponding to each
one of the memory modules 40_1, 40_2, . . . 40_m. The input
modules 30_1, 30_2, . . . 30_m can be used for multiple purposes;
however, the primary purpose of the input modules 30_1, 30_2,
. . . 30_m is to hold a received cell for a predetermined time
period before it is stored in the respective memory
module. Another function of modules 30_1, 30_2, . . . 30_m is to
hold a received ATM cell and provide the parameters j and
k from the cell's self-routing tag to memory
controllers 50_1, 50_2, . . . 50_m. The memory controllers use
the parameter j to write the received ATM cell in the j-th
memory location of the corresponding memory modules
40_1, 40_2, . . . 40_m. Corresponding to each memory controller
50_1, 50_2, . . . 50_m there is one Output Scan Array (OSA), each
with σ locations. The j-th location of the Output Scan Array
(OSA) holds the scan value of a received ATM cell stored in

[...] ATM cell memory locations in all of the memory modules
40_1, 40_2, . . . 40_m, is represented as a three dimensional space
(i,j,k) and is shown in FIG. 3. The total buffer space of the
switching system of FIG. 2 is also called shared buffer space
or global buffer space, as multiple input and output lines can
have access to memory locations belonging to any of the
memory modules 40_1, 40_2, . . . 40_m employed in the
exemplary embodiment of the switching apparatus of FIG. 2. In
the sliding-window switching method, the ATM memory
locations in the global buffer space are represented by state
(i,j,k) where

the i-th coordinate represents the memory module; i=[1 . . . m],
where m is the number of memory modules 40_1, 40_2, . . .
40_m employed in the switching apparatus;

the j-th coordinate represents the output-slot vector (osv);
j=[1 . . . σ], where σ is the number of ATM cell memory locations
in the memory modules;

the k-th coordinate represents the scan-plane (sp) value;
k=[1 . . . [...]], where [...] is an upper bound that designates the
number of times, compared to the scan length σ, that an
[...] called the scan plane and is designated by 15_1, 15_2, . . .
[...]. Each scan plane is divided into σ output-slot vectors
(OSVs). Each OSV consists of m consecutive
slots (also called memory slots), where m is the number of
memory modules 40_1, 40_2, . . . 40_m employed in the system.
The output-slot vector (OSV) j represents a group of j-th ATM
cell memory locations in the m employed
memory modules. The sliding-window 18 (shown in FIG. 3)
is a pointer to a group of cells forming the output-slot
vectors (OSV) in the memory space, and it advances by one
OSV upon completion of every switch cycle on a given scan
plane. Input and output of ATM cells take place with respect
to the current location of the sliding-window and the last cell

the corresponding j** location of its memory module. OSA admitted to the multidimensional global buffer space. The 

of each memory controller is updated at the time of Write i oca tion of the sliding window (SW) 18 in the global buffer 

and Read of ATM cells to and from the respective locations space fe descr i bed by two variables indicated by (i) SW.osv 

in the memory modules. During the Write cycle of an (interchangeably used with SW.j) and (ii) SW.sp 

incoming cell to j memory location in a given memory 4Q (interchangeable used with SW .k). For example, in FIG. 3, 

module i, the scan-plane value (k) of the received cell is r ' 5 . , 1C . - nf ' in iUo aJ v : c an A 

. . ' j » , tH , v v • *u * * o the sliding-window 18 is a pointer to the OSV-j-5 and is 

stored in the corresponding j location in the Output Scan , . & , r , m u - * n 

Array (OSA) of the corresponding memory controller. Dur- lrav k ersm g on ^ SeC ™ d SCan P Une . 15 ? ha ™ § k=2 " F ° f 
ing the Read cycle of a cell from the j fA location of a memory s » c , h a ^^^t^^™^* above example 
module, the corresponding j" 1 location in the Output Scan SW.osv~SW.j-5 and SW.sp=SW.k«2 The symbol osv and 
Array (OSA) is set to 0 to indicate empty memory-location 45 V dcnotc out P ut slot vector and ^ m P lanc and are 
in the corresponding memory module. During the ATM cell interchangeably used, in this description, with j and k 
read cycle, the ATM cells are output from parallel and variables respectively. The variable SW.sp (which is inter- 
independent memory modules 40, , 40 2 , . . . 40 m and are changeably used with SW.k) holds an integer value which is 
finally routed to respective output destinations 2,, 2 2 , . . . 2 r incremented by one on the completion of sliding- window's 
by the output interconnection network 60. The output inter- 50 traversal on each scan-plane. Similarly, the variable SW.osv 
connection network 60 makes use of the output port desti- (which is interchangeably used with SW.j) holds an integer 
nation information d stored in a cell's header to route each value which is incremented by one on the completion of 
cell to final output destination 2, , 2 2 , .. ,2 r . In the exemplary sliding-window 1 s traversal of a given output slot vector 
embodiment of the disclosed ATM switching apparatus of (OSV). To keep the SW.sp and SW.osv variables from 
FIG. 2, the final output fine destination information 'd' can 55 becoming unbounded, the modulus of the scan-plane van- 
also be seen as a part of the routing tag, with the difference ab j e y^fa a predetermined upper bound value (p) of the 
that instead of residing in the routing tag, the destination sca n-plane and the modulus of the OSV variable with a 
information <d' resides in the header of each incoming cells. prcdct ermined upper bound value a of the output slot vector 
The ATM switching apparatus of the disclosed invention /q SV j is taken j^is Sliding -Window 18 of FIG. 3 traverses 
makes use of a new switching method called the Sliding- 6Q thc cntirc , oba] buffcr b traversing a output slot 
Window ATM switching method. The following section vectors (0 SVs) on all of the employed scan-planes 15,, 15 2 , 
describe the under ying switching Unctions of the disclosed n ^ fashion Fof an incom ^ 
invention of the Shding-Window ATM switching method. tQ ^ ^ d of ^ switching ^ 

THE SLIDING-WINDOW ATM SWITCHING assignment of a memory-slot (i) of an OS V(i) on a scan- 

METHOD 65 (k) is dependent on the length of its output queue, Q d 

According to the disclosed Sliding- Window ATM switch- in the global buffer space and on the current location of the 

ing method, the entire buffer space which includes all the sliding-window 18. The successive cells of an output queue 
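The bounded update of the sliding-window location described above can be sketched as follows. This is a minimal illustration and not the disclosed circuit: the class name `SlidingWindow`, the method name `advance`, the starting location (1, 1) and the 1-based wrap-around (`x mod bound + 1`) are assumptions chosen to match the 1-based OSV and scan-plane numbering used in the text.

```python
from dataclasses import dataclass


@dataclass
class SlidingWindow:
    sigma: int      # scan length: number of OSVs on one scan plane
    p: int          # number of scan planes
    osv: int = 1    # SW.osv (also written SW.j), 1-based
    sp: int = 1     # SW.sp (also written SW.k), 1-based

    def advance(self) -> None:
        """Move to the next OSV; step to the next scan plane on wrap-around."""
        self.osv = self.osv % self.sigma + 1
        if self.osv == 1:                      # all OSVs of this plane traversed
            self.sp = self.sp % self.p + 1


sw = SlidingWindow(sigma=4, p=2, osv=4, sp=1)
sw.advance()
print(sw.osv, sw.sp)   # 1 2
```

With σ OSVs per plane and p planes, the pointer returns to its starting location after p·σ calls to `advance`, which is the cyclic traversal of the global buffer space described in the text.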
The successive cells of an output queue Q_d occupy successive OSVs with one of its cells in one OSV on a given scan-plane. When the queue Q_d exceeds the σ-th OSV on one scan-plane, it starts acquiring slots of the OSVs of the next scan plane. Thus an output queue can grow up to a length of σ on a given scan plane. σ, the number of OSVs on one scan plane, is also called the scan-length of the employed global buffer space and is equal to the number of ATM cell locations in a given memory module. The number of scan planes 15_1, 15_2, ... 15_p to be employed in the switching system is determined by the maximum queue length (= p.σ) allowed for an output port. If the maximum length of an output queue is allowed to be p.σ then p scan planes are employed in the system, as an output queue can grow only up to a length of σ on a given scan plane. As an example, if the maximum length for output queues is allowed to be 2048 ATM cells in the global buffer space (i.e. p.σ=2048) of the ATM switching apparatus, and if the number of ATM cell memory locations in memory modules 40_1, 40_2, ... 40_m is equal to 512 ATM cells (i.e. σ=512), then the number of scan planes to be employed in the switching apparatus = p = (2048/512) = 4. In effect, the number of scan-planes, i.e. p, employed in the ATM switching system of FIG. 2 controls the allowed maximum number of cells waiting for an output port (i.e. maximum queue length) inside the global buffer space, which includes all the ATM cell memory locations in all of the memory modules 40_1, 40_2, ... 40_m.

The concept of traversal of the sliding-window through the entire buffer space and its relation to the switch cycle and the switching operation is depicted by the flow-chart of FIG. 4. The traversal of the sliding-window through the multidimensional global memory space depicts the way the sliding-window pointer is updated along with the switching functions performed every switch cycle. In the flow chart of FIG. 4, step 400 indicates the beginning of the switch operation. Step 402 shows the initial value of the variables SW.osv and SW.sp, indicating the initial location of the sliding-window in the global buffer space. On the onset of the switching operation, as shown in step 404, various switching functions are performed on the incoming cells. The switching functions may include one or more of the following operations: read destination addresses from headers of the incoming cells, update counters and tables, attach a new self-routing tag to the cells, write cells to the memory modules, read cells from memory modules etc. Upon completion of the switching functions, the system waits in step 406 for the start of a new cycle. In case no cells are received or no switching functions are to be performed in step 404, the system just goes to step 406 and waits for a new cycle to start. In the beginning of every new cycle, counters and variables are updated in step 408 to account for changes, if any, in the previous switch cycle. In the new switch cycle, the sliding-window is advanced to the next OSV in step 410 with its scan plane variable, i.e. SW.sp, being unchanged. Step 412 examines if the sliding window has already traversed all the OSVs on a given scan plane and if it needs to start traversing the new scan plane. If the sliding window has not traversed all the OSVs on a given scan plane then the flow loops back to step 404 to perform new switching functions corresponding to the new value of the sliding-window pointer. If the sliding-window has traversed all OSVs on a given scan plane and is starting over with the initial OSV of 1 (as indicated by the initial value of 1 for OSV, in step 412), then the scan plane variable of the sliding window, i.e. SW.sp, is updated in step 414 to indicate the beginning of its traversal on the successive scan plane. With the updated location of the sliding window denoted by the variables SW.osv and SW.sp, the new switching functions are performed, and this is denoted by the control flow looping back to step 404. Upon completion of the switching functions in step 404, the system again waits for a new cycle in step 406. The underlying switching function of the sliding-window method at step 404 is that during the input phase of each switch cycle, incoming ATM cells are assigned memory locations within the global buffer space with the help of the self-routing parameter assignment circuit 14, and during the output phase of each switch cycle, all the ATM cells belonging to the output-slot vector (OSV) pointed to by the sliding window (SW) 18 on a given scan-plane are output. The output phase of the switch, which consists of reading out the cells from memory modules and their routing through the output interconnection network, marks the end of one switch cycle. The sliding window (SW) 18, as shown in FIG. 3, cyclically scans the entire buffer space by traversing all of the σ OSVs on each scan-plane (sp) of the global buffer space and, as shown in FIG. 4, switching functions are performed corresponding to every state of the sliding-window during its traversal of the multidimensional global memory space.

In the exemplary embodiment of the present invention, the switching of ATM cells by the switching apparatus of FIG. 2 is partitioned into multiple independent operations, namely, the self-routing parameter assignment operation, routing of cells to memory modules using the input interconnection network, the ATM cells' memory WRITE operation, the ATM cells' memory READ operation, and routing of cells obtained from memory modules to the destined output lines using the output interconnection network.

SELF-ROUTING PARAMETERS (i,j,k) ASSIGNMENT

As mentioned earlier, the assignment of self-routing parameters (i,j,k) to the incoming cells is achieved by the parameter assignment circuits 14. An additional routing-tag carrying the self-routing parameters (i,j,k) is attached to each incoming ATM cell. The self-routing parameters help ATM cells to self propagate through the switching apparatus of FIG. 2. The self-routing parameters also help achieve independence from the use of any centralized buffer controller and hence facilitate decentralized and pipeline control for faster switching operations.

Determination of self-routing parameters (i, j, k) by an exemplary assignment circuit 14 for an incoming ATM cell is shown by the flow chart of FIG. 5. The symbols used therein are described as follows:

d is the switching system's output-port 2_1, 2_2, ... 2_r destination which resides in the header portion of the incoming ATM cell. In the exemplary embodiment of switching apparatus of FIG. 2, d={1,2, ... r}.

j_d is the assigned output-slot vector (OSV) in the global buffer space for an incoming ATM cell destined to output port d.

k_d denotes the assigned value of the scan-plane in the global buffer space for an incoming ATM cell destined to output port d.

i_d is the assigned memory slot in the assigned OSV, j_d above. i_d designates one of memory modules 40_1, 40_2, ... 40_m.

σ is the maximum number of output slot vectors (OSVs) present on the scan planes of the global buffer space.

p is the maximum number of scan-planes 15_1, 15_2, ... 15_p employed in the global buffer space.

X is the set of all ATM cells input during a given switch cycle, 0<=|X|<=n, where n is the number of input ports 1_1, 1_2, ... 1_n.
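The symbols defined above can be collected into a small data model of a cell and its attached routing tag. The sketch below is only illustrative: the names `RoutingTag`, `Cell` and `attach_tag`, the `payload` field and the range check are hypothetical and are not taken from the disclosure, which only states that a tag carrying (i,j,k) is attached to each incoming cell while d stays in the cell header.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RoutingTag:
    i: int  # memory module, 1..m
    j: int  # output-slot vector / memory location in the module, 1..sigma
    k: int  # scan plane, 1..p


@dataclass(frozen=True)
class Cell:
    d: int                              # output-port destination from the header, 1..r
    payload: bytes = b""
    tag: Optional[RoutingTag] = None    # attached by the assignment circuit


def attach_tag(cell: Cell, tag: RoutingTag, m: int, sigma: int, p: int) -> Cell:
    """Return a copy of the cell carrying the tag, after checking its ranges."""
    if not (1 <= tag.i <= m and 1 <= tag.j <= sigma and 1 <= tag.k <= p):
        raise ValueError(f"tag {tag} lies outside the m x sigma x p buffer space")
    return replace(cell, tag=tag)


tagged = attach_tag(Cell(d=3), RoutingTag(i=2, j=5, k=1), m=4, sigma=512, p=4)
print(tagged.tag)   # RoutingTag(i=2, j=5, k=1)
```

Keeping d in the cell and (i,j,k) in a separate tag mirrors the statement that the destination information resides in the header rather than in the routing tag.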
The assignment circuit 14 and the flow diagram of FIG. 5 use a set of counters and tables (shown in FIG. 6) to facilitate the assignment of self-routing parameters. The self-routing parameter assignment circuit 14, in this exemplary embodiment of the ATM switching apparatus, uses two separate processors (FIG. 6). The first processor 600 receives the destination address of the incoming cells from header processing circuits 10_1, 10_2, ... 10_n and uses steps 506 to 524 of the flow chart in FIG. 5 to assign j and k parameters. Once the j and k values are determined, processor 1 sends the j and k parameters to another processor 650 for determination of the parameter i. While processor 2 works to find the i parameter for a cell as shown in step 526 of the FIG. 5 flow-chart, processor 1 starts working in parallel on determination of j and k parameters for the next cell. In effect, processor 1 600 and processor 2 650 of FIG. 6 work in parallel to determine j,k parameters and the corresponding i-th parameter for incoming cells in a given cycle. The counters 610 and 670, called sliding-window counters, hold the current location for the sliding-window pointer in global buffer space. With every switch cycle, the sliding-window counters 610 and 670 of processors 600 and 650 update their values independently according to the sliding-window traversal concept of FIG. 4. The relation of update of the sliding-window counter values with each switch cycle and associated switching functions is shown in the flow chart of FIG. 4. In FIG. 6, the sliding-window counters 610 and 670 specify the variable SW.osv, which designates the OSV that holds the current location of the sliding window in global memory space in a given switch cycle. The counters 610 and 670 also specify the variable SW.sp, which designates the scan-plane that holds the traversal of the sliding-window in a given switch cycle. The queue length counter (QLC) 620 holds the length of the queue of cells destined to respective output port 2_1, 2_2, ... 2_r destinations. The respective queue length is designated by Q_d where d=1,2, ... r. The counter 630, called Last cell counter (LLC), holds the value of scan plane and output slot vector of the last cells entered in the global buffer space for all the output port 2_1, 2_2, ... 2_r destinations. The variable (LC.j)_d designates the OSV value assigned to the last cell destined to the output d and the variable (LC.k)_d designates the scan-plane value assigned to the last cell destined to the output d. A two dimensional array 660, also called scan table (ST), is used for determination of parameter i by the processor 2 650 of FIG. 6. The slots of the scan table are designated by ST(i,j) wherein i and j denote the rows and columns of the scan table respectively. The parameter i can take values from 1 to m, where m is the number of memory modules 40_1, 40_2, ... 40_m employed in the exemplary switching system of FIG. 2. The parameter j can take values from 1 to σ, where σ is the number of ATM memory locations in the employed memory modules 40_1, 40_2, ... 40_m. The content of a slot of the scan table, i.e. ST(i,j), holds only the value of the scan variable k belonging to the ATM cell which is stored in the j-th location of the i-th memory module in global buffer space. Hence ST(i,j)=k, where k>0, indicates that the j-th location of the i-th memory module holds a valid ATM cell whose scan-plane value is k. Whereas, ST(i,j)=0 indicates that the j-th location of the i-th memory module in the global buffer space is empty and does not hold a valid ATM cell.

The flow chart in FIG. 5 shows the assignment process for the self-routing parameters (i,j,k) to the incoming ATM cells. In these steps, the output slot vector (osv) and scan-plane value (sp) are also represented by the j and k variables interchangeably. Q_d represents the queue length for output d. X represents the set of ATM cells input to the switch during a given switch cycle. j_x,d or just j_d represents the OSV assigned to the cell x destined to output d. k_x,d or just k_d represents the scan-plane value assigned to the cell x destined to output d.

Step 500 shows the initial state where X cells are input in a given cycle through the incoming ports 1_1, 1_2, ... 1_n. Step 502 shows removal of a cell x from the non-empty set of input cells X_t={x(t) | t=current cycle} for the purpose of determining output port d for the chosen ATM cell x in step 504. The steps 502-504 can also be performed by the header processing circuits. The determination of the output port d is straightforward, as the incoming cell header already contains the information about its output port destination. The destination information d and the QLC [illegible in source] for the cell x in step [illegible in source] of the flow chart in FIG. 5. Step 506 increments the value of Q_d to take into account the new arrival. According to step 508, if (Q_d>p.σ) then cell x is dropped and the Q_d value is decremented by one in step 510 and the assignment process loops back to step 502 to process another cell input in that cycle. Here p.σ is a predetermined upper limit imposed on the length of a queue inside the global buffer space.

In step 512 the queue length of a given destination port is compared. If Q_d=1 then it means it is the only cell for the given destination port 'd' in the global buffer space and it need not wait inside the buffer, as there are no other cells for that destination port waiting for their turn to be read out. In such a case, step 514 is followed, according to which the OSV and the scan plane value of the current location of the sliding window counter 610 are assigned as j and k parameters for the incoming cell in step 514 of FIG. 5, i.e. {j_d=(LC.j)_d=SW.osv; k_d=(LC.k)_d=SW.sp;}. If the value of Q_d>1 then it [illegible in source] and steps 516 to 522 are used, with the last cell counter 630 of FIG. 6, to assign the j and k parameters to the incoming cells. According to step 516 {j_d=(LC.j)_d mod σ+1}, which means the consecutive OSV, i.e. the OSV next to the given destination's last cell's OSV, is assigned as the j variable for the incoming ATM cell. To assign the k variable, the assigned OSV j_d of the incoming cell destined to output port d is first examined in step 518. If j_d=1 as shown in step 518 then it means that the assigned output slot vector is on a new scan-plane and the scan-plane value to be assigned to the incoming cell is incremented by 1, in step 522, as k_d=(LC.k)_d mod p+1. On the contrary, if the value of the assigned j_d of the incoming cell is not equal to 1 then it means that the assigned output slot vector is on the same scan plane as the last cell assigned for that destination's output queue, and the same value of LC.k from the counter 630 is assigned as the k parameter for the incoming cell in step 520 of the flow chart of FIG. 5. By now in the flow chart of FIG. 5, an incoming cell destined to d has obtained two out of its three routing parameters, i.e. the OSV as j_d and the scan plane k_d.

Step 524 in the flow chart of FIG. 5 indicates that once the j and k parameters are determined by processor 1 600 of FIG. 6 then they are sent to another special purpose processor 2 650 of FIG. 6 for the determination of the i parameter with the help of a scan table 660. Processor 1 640 starts processing to determine the next cell's j and k parameters (as shown by the loop back in step 528 of the flow chart in FIG. 5) in parallel with the processor 2 680, which is working to find the i parameter (as shown by step 526 of the flow chart in FIG. 5) for the previous ATM cell. While assigning the i-th parameter, an attempt is made by the processor 2 650 to assign different i-th parameters (i.e. different rows in the scan table 660) to the cells belonging to the same input cycle so that they can be routed by the input interconnection network 20 to respective memory modules 40_1, 40_2, ... 40_m in parallel with smaller delay.
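The j and k assignment of steps 506 to 522 described above can be sketched as follows. This is a hedged reconstruction rather than the disclosed circuit: the names `AssignmentState` and `assign_jk` are hypothetical, the drop test is assumed to be Q_d > p·σ (the comparison is partly illegible in the source), and the sketch does not model the determination of i (step 526) or how Q_d is reduced when cells leave the buffer.

```python
from dataclasses import dataclass, field


@dataclass
class AssignmentState:
    sigma: int                                # scan length (OSVs per scan plane)
    p: int                                    # number of scan planes
    sw_osv: int = 1                           # SW.osv, current sliding-window OSV
    sw_sp: int = 1                            # SW.sp, current sliding-window scan plane
    qlc: dict = field(default_factory=dict)   # Q_d for each output port d
    lc: dict = field(default_factory=dict)    # (LC.j, LC.k) for each output port d


def assign_jk(state, d):
    """Return (j, k) for a cell destined to port d, or None if it is dropped."""
    q = state.qlc.get(d, 0) + 1               # step 506: count the new arrival
    if q > state.p * state.sigma:             # step 508: queue limit exceeded
        return None                           # step 510: drop, Q_d left unchanged
    state.qlc[d] = q
    if q == 1:                                # steps 512-514: only cell queued for d
        j, k = state.sw_osv, state.sw_sp
    else:
        last_j, last_k = state.lc[d]
        j = last_j % state.sigma + 1          # step 516: OSV after the last cell's OSV
        if j == 1:                            # step 518: wrapped onto a new scan plane
            k = last_k % state.p + 1          # step 522
        else:
            k = last_k                        # step 520: same scan plane as last cell
    state.lc[d] = (j, k)                      # remember the last cell admitted for d
    return j, k


state = AssignmentState(sigma=4, p=2, sw_osv=3, sw_sp=1)
print([assign_jk(state, d=7) for _ in range(3)])   # [(3, 1), (4, 1), (1, 2)]
```

The third cell wraps from OSV 4 to OSV 1 and therefore moves to scan plane 2, which is how a queue longer than the remaining OSVs on one plane continues on the next plane.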
Assignment of different i-th parameters to the cells belonging to the same input cycle enhances the parallelism while routing the cells to different memory modules. One way to assure the assignment of different memory modules is to employ a sufficient number of memory modules in the switching apparatus so that a sufficient number of memory slots are always available in an assigned output slot vector. The minimum number of memory modules to be employed in the system also depends on the maximum length of queue allowed in the switching apparatus and is discussed in detail in a later section.

Once ATM cells get their self-routing tag (i,j,k) from the parameter assignment circuit 14, the ATM cells, thereafter, are self routed through the various stages of the switching apparatus of FIG. 2.

INPUT INTERCONNECTION NETWORK

The input interconnection network examines the i-th parameter of the routing tag of received ATM cells and provides routing of the ATM cell to its i-th output line, which is connected to the i-th memory module. Mapping of cells from the input lines to the output lines of the input interconnection network 20 can be achieved in very many ways, and operations of such interconnection networks are well known in the art. One way to provide the needed input and output mapping function is to use a processor local to the interconnection network 20 and the information in the i-th parameter of the self-routing tag of the incoming cells. Another well known way is to use a self-routing multistage interconnection network where each node looks at the i-th parameter of the routing tag to know the output line destination of the received cell and performs the corresponding switching. The size n×m of the input interconnection network 20 is used, where n is the number of input lines and m is the number of memory modules employed in the preferred embodiment of the switching apparatus according to the present invention.

MEMORY MODULES

A plurality of memory modules are employed in the switching apparatus. Memory modules are placed in between the input interconnection network 20 and output interconnection network 60 as shown in the preferred embodiment of the switching apparatus, FIG. 2, according to the present invention. Each output line of input interconnection network 20 and input line of the output interconnection network 60 are connected to a single memory module. The memory modules employed in the disclosed switching apparatus of the present invention can be either single-port or dual-port memory modules. In case of the use of dual port memory modules, the data-in port of a memory module is connected to an output line of the input interconnection network 20, while the data-out port of a memory module is connected to an input line of the output interconnection network 60.

ATM-CELL WRITE OPERATION

FIG. 7 provides the detailed structure of the memory control component known as the sliding-window memory controller 50_1, 50_2, ... 50_m. Every memory module 40_1, 40_2, ... 40_m has a corresponding sliding-window memory controller 50_1, 50_2, ... 50_m which is used to provide the write and read addresses for memory-write and memory-read operations needed for switching of ATM cells. The write operation performed by the memory controllers is shown by a flow diagram in FIG. 8. According to the disclosed switching method of the present invention, the routing tags of the received ATM cells are sent to the controllers 50_1, 50_2, ... 50_m by the input modules 30_1, 30_2, ... 30_m, as shown in step 850 of FIG. 8. As shown in step 852 of FIG. 8, the memory controllers 50_1, 50_2, ... 50_m use the value of the j parameter in the self-routing tag as the WRITE address to write the received ATM cell to the j-th ATM location in the respective memory modules 40_1, 40_2, ... 40_m. The sliding-window memory controllers also use an array called output scan array (OSA) 54_1, 54_2, ... 54_m (FIG. 7), each with σ slots. As shown in step 854 of FIG. 8, the OSAs 54_1, 54_2, ... 54_m store the scan plane value k, obtained from the self-routing tag (i,j,k) of the received cells, in the j-th location of the OSA for every ATM cell that is written in the j-th location of the corresponding memory modules 40_1, 40_2, ... 40_m. The scan value of 0 in a given OSA slot j means that the memory location j in the corresponding memory module is empty and does not hold a valid cell. The valid ATM cell locations in the memory modules 40_1, 40_2, ... 40_m always have a non-zero scan value stored in the corresponding location of OSAs 54_1, 54_2, ... 54_m.

The ATM cell read operation performed by the memory controllers 50_1, 50_2, ... 50_m is shown by flow chart steps 800-806 in FIG. 8. The memory controllers 50_1, 50_2, ... 50_m also use a sliding-window counter in the read processor 56_1, 56_2, ... 56_m (FIG. 7) respectively, which keeps the current location of the sliding-window in the global buffer space using the variables SW.sp and SW.osv. The sliding-window counters 56_1, 56_2, ... 56_m also update the variables SW.sp and SW.osv every switch cycle using the traversal method (which actually is the variable update process) of the sliding-window as depicted by the flow chart in FIG. 4. The sliding-window counters in 56_1, 56_2, ... 56_m (FIG. 7) provide READ addresses for the output of the ATM cells from memory modules 40_1, 40_2, ... 40_m in a given switch cycle. Every ATM-cell READ cycle, the valid ATM cells belonging to location SW.osv from all the parallel memory modules 40_1, 40_2, ... 40_m are output. The validity of the cells is decided by the scan value k stored in the SW.osv location of the OSA as follows: (i) according to step 802 of FIG. 8, if the content (which is the scan plane value) of the location SW.osv in OSA=0 then it means that the location SW.osv in a given memory module is empty and no read operation is performed; (ii) according to step 804, if the content of location SW.osv in OSA is not equal to SW.sp then the cell is not valid and the stored cell is not read from the location SW.osv. In such a case the cell is rather retained in the memory module for its turn in future read operations; (iii) according to step 804 in FIG. 8, a cell is read out of the memory module only if the content of location SW.osv in OSA holds a value=SW.sp, i.e. the scan plane value in the counter 56_i. Only under such a condition, step 806 of FIG. 8, does the memory controller provide the READ address SW.osv for outputting the stored ATM cell from its memory module. As shown in step 806 of FIG. 8, every time a cell is output from a location of a memory module, the memory controller updates the OSA by resetting OSA(j)=0 to denote the presence of an empty location in its memory module.

SIMULTANEOUS WRITE AND READ OPERATIONS

The disclosed switching apparatus can employ both single port or dual port memory modules.
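For either kind of memory module, the WRITE steps (852 to 854) and READ steps (800 to 806) of FIG. 8 can be sketched as a single controller object. This is an illustrative sketch only: the class name `MemoryController`, its method names and the use of Python lists for the memory module and the OSA are assumptions, and index 0 is left unused so that locations run from 1 to σ as in the text.

```python
class MemoryController:
    """One sliding-window memory controller with its memory module and OSA."""

    def __init__(self, sigma):
        self.cells = [None] * (sigma + 1)   # index 0 unused; locations 1..sigma
        self.osa = [0] * (sigma + 1)        # scan value 0 marks an empty location

    def write(self, j, k, cell):
        """Steps 852-854: store the cell at location j and record its scan plane k."""
        self.cells[j] = cell
        self.osa[j] = k

    def read(self, sw_osv, sw_sp):
        """Steps 800-806: output the cell at SW.osv only if its scan value is SW.sp."""
        if self.osa[sw_osv] == 0:           # step 802: empty location, nothing to read
            return None
        if self.osa[sw_osv] != sw_sp:       # step 804: cell waits for its scan plane
            return None
        cell = self.cells[sw_osv]           # step 806: read out and mark empty
        self.cells[sw_osv] = None
        self.osa[sw_osv] = 0
        return cell


mc = MemoryController(sigma=4)
mc.write(j=2, k=2, cell="cell-A")
print(mc.read(sw_osv=2, sw_sp=1))   # None
print(mc.read(sw_osv=2, sw_sp=2))   # cell-A
```

Because valid scan-plane values start at 1, the value 0 can safely double as the empty marker in the OSA, as the text describes.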
Use of dual port memory modules enhances the effective memory speed for read and write of ATM cells, and the overall switching speed of the disclosed switching apparatus. Use of dual port memory will allow simultaneous WRITE and READ of ATM cells to and from a memory module in the same switch cycle only if the WRITE and READ operations do not access the same memory location. According to the disclosed switching method of the present invention, the parameter assignment phase ensures that WRITE and READ of ATM cells do not access the same memory location of a given memory module. Hence the disclosed switching method makes it possible to use dual port memory modules for the switching apparatus of the present invention. The WRITE and READ operations (FIG. 8) performed by the memory controllers can be used for both single port and dual port memory modules. In case of the use of dual port memory modules, the memory controllers will need to produce the write address as well as the read address for their memory modules in the same cycle. The order of WRITE and READ operations performed by the memory controllers (FIG. 8) in a given cycle, to produce write and read addresses, does not matter, as the operations performed in either order produce the same final result. For the sake of presentation, it can be assumed that in a given cycle, the memory controllers perform READ operations (FIG. 8) to produce the read address before performing WRITE operations (FIG. 8) to obtain the write address.

OUTPUT INTERCONNECTION NETWORK

The output interconnection network 60 examines the destination information 'd' in the header of the received ATM cells. The output interconnection network provides switching of ATM cells received from the memory modules to the destined output lines of the switching apparatus. The output interconnection network architecture 60 can be similar to the one used for the input interconnection network 20. The self-routing multistage interconnection networks, which are known in the art, can also be used for the output interconnection network to perform the needed input and output mapping of cells for switching purposes. Each node of the multistage interconnection network examines the destination information in the header of the received cell and performs the respective switching functions. The size m×r of the output interconnection network 60 is used, where m is the number of memory modules and r is the number of output lines employed in the preferred embodiment of the switching apparatus according to the present invention.

For those skilled in the art, it will be obvious that the disclosed switching apparatus and method according to the

queue. However, the successful assignment of a memory slot in an assigned output slot vector requires that a sufficient number of memory slots are deployed in the OSVs. Alternatively, it requires that a sufficient number of memory modules be employed in the switching apparatus in order to achieve the best possible delay throughput performance.

The total number of memory location states available in the multidimensional global buffer space = p.m.σ (FIG. 3). Since the finite global buffer space is divided into various scan planes, the occupancy of the scan planes is made mutually disjoint, i.e. if a memory slot (i,j) is occupied on a given scan plane then the memory slot (i,j) will be forbidden on all other scan planes. Thus, in effect, occupancy of γ cells on a scan plane means p.γ states will become forbidden. Therefore, if γ incoming cells are assigned memory locations in the global buffer space then the remaining states available for occupancy is given by α(γ) where,

α(γ) = Number of available states = (p.m.σ − p.γ) = p.(m.σ − γ)    (1)

The multidimensional global memory space is shared by the cells belonging to all the output ports of the disclosed switching apparatus. One disadvantage of sharing is that because of its finite buffer space, it is possible for a single or a group of bursty sources to occupy the entire buffer space, hence throttling the passage of ATM cells through the shared buffer for other source-destination pairs. Such a situation is a [illegible in source] in a bursty environment and it causes the performance of a switch using a shared space to degrade, especially at higher loads. In order to prevent such a situation, additional precautions are taken. One way to prevent such a situation is to impose an upper limit on the maximum length of output queues. An output port whose output queue has achieved the maximum queue length is considered saturated, and an ATM cell arriving to a saturated output port is dropped in order to prevent an output queue from growing unboundedly. In the disclosed switching method, the length of an output queue is controlled by allowing the queue length not to exceed a certain predetermined number (p) [illegible in source]. The use of scan-planes in the multidimensional global buffer space of the disclosed switching apparatus, in effect, controls the queue length of an output port.

Let the maximum length of an output queue allowed be p.σ for an N×N size switch of the disclosed invention, employing a common global buffer space of capacity N.σ, where 1≤p≤N and σ being the employed scan length. Let i

present invention can manifest in various embodiments be the minimum number of destinations whose cells can 

depending on the kind of interconnection networks used for occupy the entire buffer space by growing to their maximum 

input interconnection network 20 and output interconnection 50 length. Assuming, that all the i output queues can grow to its 

network 60. Such modifications are to be considered under maximum length (=p.cr), the number of cells occupying the 

scope the disclosed invention. entire shared buffer space=i.p.a. Under the conditions of 

complete occupancy of the global buffer space, the number 

REQUIREMENT ON THE NUMBER OF of available states=0. Hence, using eq. (1), the number of 

MEMORY MODULES 5s available states after an occupancy of ipa is given by 
The minimum number of memory modules employed in 

the system or the number of memory slots employed in an a(ipa)=p(Na-ipa)=o 

output slot vector (OSV) depends on the memory sharing the mmimum number of destinations (i) having 

scheme used for the global buffer space of the disclosed their cells or packets occupy the common global buffer space 

switching apparatus. A best possible delay-throughput per- 60 Q f capac jty 
formance in shared global memory space can be achieved if 

a cell of an output queue is delayed only by the preceding N (2) 

cells of its own non-empty queue. The disclosed switching N><r = / = — 

method according to this invention achieves best possible 9 
delay- throughput performance by assigning routing param- 65 

eters(ijjc) in such a way that consecutive output slot vectors As mentioned earlier, it's possible for a group of desti- 

are assigned to the consecutive ATM cells of an output nation packets to completely occupy the shared space of the 
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ATM switch and not allow other source-destination connec- 
tions to be established through the shared buffer switch. 
Such a phenomenon would degrade the best possible delay-throughput performance and cause excessive cell loss, especially at higher loads or under nonuniform or unbalanced traffic. A fair sharing scheme would always allow the packets of each destination a connection through the shared buffer space, despite the fact that the packets of a subset of destinations might occupy the entire shared buffer space. If the buffer space of capacity N.a is to be shared among the packets of N destinations, then additional memory slots must be employed in an OSV to always achieve best possible delay-throughput performance.
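
The counting argument behind eq. (1) and eq. (2) can be sketched in a few lines of Python. This is only an illustration: it assumes eq. (1) reads α(y) = p.(m.a − y) and eq. (2) reads i = ⌈N/p⌉, and the function and parameter names are hypothetical, not taken from the disclosure.

```python
def available_states(p: int, m: int, a: int, y: int) -> int:
    """Eq. (1), as assumed above: states left after y cells are stored.

    p = number of scan planes, m = number of memory modules,
    a = scan length (number of output slot vectors), y = stored cells.
    Each stored cell forbids its slot (i, j) on all p scan planes.
    """
    return p * (m * a - y)


def min_destinations_to_fill(n: int, p: int) -> int:
    """Eq. (2), as assumed above: fewest output queues, each limited to
    p*a cells, that can together fill a buffer of capacity n*a."""
    return -(-n // p)  # integer ceiling of n / p


if __name__ == "__main__":
    # Hypothetical numbers: 4 modules of 12 slots each, 2 scan planes.
    print(available_states(p=2, m=4, a=12, y=0))
    print(available_states(p=2, m=4, a=12, y=48))
    print(min_destinations_to_fill(n=4, p=2))
```

With a full buffer (y = m.a) the first function returns zero, which is the condition used in the text to derive the minimum number of destinations.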

Let the common buffer space capacity be N.a for an NxN switching apparatus employing a output slot vectors and p scan planes in its multidimensional global buffer space. In the case of complete occupancy of the buffer space, the number of occupied states in an OSV = (N.a/a) = N.

Also from eq. (2), the minimum number of destinations that can completely occupy the shared space is

$$i = \left\lceil \frac{N}{p} \right\rceil$$

According to this, it is possible for the sliding-window to encounter an OSV in a given cycle whose slots might already be full with the packets of i destinations, and furthermore, it is also possible for cells or packets destined to the remaining (N-i) output ports to be input to the current OSV in the same cycle. In order to avoid any additional delays, these (N-i) packets must be assigned the same OSV. In the worst case, this would require an OSV to accommodate an additional (N-i) cells or packets. Hence, in order to achieve best possible delay-throughput performance, the minimum number of memory slots in an OSV of the global buffer space, and the minimum number of memory modules to be deployed in the disclosed switching apparatus of FIG. 2, is

$$N + (N - i) = 2N - \left\lceil \frac{N}{p} \right\rceil \tag{3}$$

where $i = \lceil N/p \rceil$ from eq. (2).

It is known in the art that buffer sharing schemes have varying impact on the performance of a switch (using a finite globally shared buffer space) under various traffic conditions, and a switching apparatus using a common buffer space must provide for various buffer sharing schemes to manage the contention among various ports for the finite global buffer space. The disclosed switching apparatus and the method according to the present invention allow for multiple sharing schemes to be implemented, such as complete sharing, complete partitioning and partial sharing of the finite global buffer space, by controlling the number of scan-planes (p) employed in the global buffer space. In order to achieve best possible delay-throughput performance for a given switch size (NxN) and for a given buffer space (N.a), depending on the sharing scheme used, a different requirement is placed on the minimum number of memory modules to be employed in the disclosed switching apparatus. Here, a is the scan-length or the number of OSVs, and p is the number of scan-planes employed in the system.

COMPLETE SHARING OF A FINITE BUFFER
SPACE

According to one embodiment of the present invention, the NxN ATM switching apparatus using multidimensional global buffer space may employ complete sharing of the memory space of N.a, where a is the number of output slot vectors (OSV) employed in the system. In the case of complete sharing of a finite buffer space with no restriction on the output queue length, it will be possible for the cells or packets of a single destination to occupy the permitted shared space of N.a. That is, it would be possible for an output queue to grow up to a length of N.a. In this case, the number of scan-planes employed p=N; the minimum number of destinations having their packets in the shared space

$$i = \left\lceil \frac{N}{N} \right\rceil = 1$$

(eq. 2). In order to achieve best possible delay-throughput performance, the total number of slots required in an OSV must at least be

$$2N - \left\lceil \frac{N}{N} \right\rceil = 2N - 1$$

(eq. 3); hence, the minimum number of memory modules employed in the switching apparatus of the present embodiment allowing complete sharing = 2N-1.

COMPLETE PARTITIONING OF A FINITE
BUFFER SPACE

In another embodiment of the present invention, the switching apparatus of size NxN may use complete partitioning of its finite global buffer space N.a equally among its N destinations, where a is the number of output slot vectors employed in the system. In the case of complete partitioning of a finite buffer space among its destinations, the shared buffer space of capacity N.a is divided into N partitions, i.e. an output queue is not allowed to exceed a length of a, i.e. one scan-length. Hence, the number of scan-planes employed p=1. The minimum number of destinations having their packets in the shared space

$$i = \left\lceil \frac{N}{1} \right\rceil = N$$

(eq. 2). In order to achieve best possible delay-throughput performance, the minimum number of slots required in an OSV must be

$$2N - \left\lceil \frac{N}{1} \right\rceil = N$$

(eq. 3). According to the present embodiment, the disclosed switching apparatus simply reduces to the case of a dedicated output buffer switch where a constant amount of dedicated buffer is employed at each output port and no sharing is allowed.
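
As a quick check on the arithmetic of eq. (3), the sketch below evaluates the minimum number of memory modules for any number of scan planes p between 1 and N. It assumes eq. (3) has the form 2N − ⌈N/p⌉; the function name is hypothetical and the code is illustrative only.

```python
def min_memory_modules(n: int, p: int) -> int:
    """Minimum memory modules (slots per OSV) for an n x n switch with
    p scan planes, assuming eq. (3): N + (N - i) with i = ceil(N / p)."""
    if not 1 <= p <= n:
        raise ValueError("p must satisfy 1 <= p <= N")
    i = -(-n // p)          # eq. (2): integer ceiling of N / p
    return n + (n - i)      # N occupied slots plus (N - i) extra arrivals


if __name__ == "__main__":
    n = 4
    for scheme, p in [("complete sharing", n),
                      ("complete partitioning", 1),
                      ("partial sharing, p=2", 2)]:
        m = min_memory_modules(n, p)
        print(f"{scheme}: m={m}, extra={m - n}")
```

Setting p = N gives 2N−1 and setting p = 1 gives N, matching the complete sharing and complete partitioning cases derived in the text.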

PARTIAL SHARING OF A FINITE BUFFER 
SPACE 

According to yet another embodiment of the present invention, the switching apparatus may employ partial sharing of its finite global buffer space. Unlike the case of the complete sharing approach, no single destination queue is allowed to grow to a length so as to occupy the shared buffer space of N.a, while unlike the case of the complete partitioning approach, an output queue is allowed to grow beyond one scan-length (i.e. a). According to the present embodiment of the switching apparatus, a restriction is imposed on the maximum length of an output queue. The maximum length of an output queue can take a value anywhere between a and N.a. Thus, according to this scheme, the number of scan-planes employed in the system = k, where 1≤k≤N. The minimum number of destinations having their packets in the shared space

$$i = \left\lceil \frac{N}{k} \right\rceil$$

(eq. 2). In order to achieve best possible delay-throughput performance, the minimum number of slots required in an OSV

$$= 2N - \left\lceil \frac{N}{k} \right\rceil$$

(eq. 3). Hence, according to the present embodiment of the switching apparatus allowing a partial sharing of finite global buffer space, the minimum number of memory modules employed in the system

$$= 2N - \left\lceil \frac{N}{k} \right\rceil.$$

TABLE 1

Requirement on the minimum number of memory modules for different buffer sharing schemes
for various embodiments of the disclosed switching apparatus of size NxN and memory space of N.a
(note: if a is the scan length then N memory modules will constitute a memory space of N.a)

Buffer sharing scheme        Required minimum number    Required minimum number    Number of scan-planes
(for a shared space = N.a)   of memory modules in the   of extra memory modules    employed in the system
                             system for best delay-     for best delay-
                             throughput performance     throughput performance

Complete Sharing             2N-1                       N-1                        N
Complete Partitioning        N                          0                          1
Partial Sharing              2N-⌈N/k⌉                   N-⌈N/k⌉                    k (1<k<N)

DECENTRALIZED PIPELINE OPERATION OF
THE DISCLOSED ATM SWITCHING
APPARATUS

According to another preferred embodiment of the switching apparatus of the present invention, the overall switching function of the switching apparatus is partitioned into multiple stages such that all of them can perform the needed switching functions independently in the same cycle without any conflict. Once the switching apparatus is divided into independent stages, these stages can operate in a pipeline fashion on received ATM cells or packets to achieve the overall switching operation. The switching operation is decentralized in the sense that there is no central controller directly coordinating, controlling or managing the operations of the multiple stages of the disclosed switching apparatus.

FIG. 9 illustrates a method to partition the overall switching function according to another embodiment of the disclosed switching apparatus and method. According to the present embodiment, the overall switching function of the apparatus is divided into multiple independent stages as follows: (i) the first stage, also called the self-routing parameter assignment stage, consists of the header processing circuits and the self-routing parameter assignment circuit 14, (ii) the second stage consists of the input interconnection network 20 and the operations performed on the received ATM cells, (iii) the third stage includes the operations involved in the WRITE of received cells to the memory modules, (iv) the fourth stage includes the operations performed for the READ of ATM cells from the memory modules, and (v) the fifth stage includes the output interconnection network 60 and the associated operations performed on the received cells. In this example, the switching apparatus is divided into 5 pipeline stages. However, it should be understood by those skilled in the art that there may exist other embodiments of the present invention according to which the switching apparatus can be divided into more than 5 or fewer than 5 pipeline stages, and such modifications shall be considered within the scope of the present invention. According to the pipeline operation of the switching apparatus of the present invention, the time taken by the pipeline stage that takes the longest to complete its switching function is chosen to be the pipeline cycle time (t). The pipeline cycle time is always chosen such that the longest pipeline stage is much less than the switching time (T) of the non-pipeline based switching apparatus. In FIG. 5, the pipeline cycle t is shown, as an example, to be one fifth of the switching cycle, i.e. T=5t. FIG. 9 shows the time chart for scheduling various switching operations in different stages at different pipeline cycles. The various stages of the time chart are denoted by (s,t), where s denotes the pipeline stage and t denotes the pipeline cycle. For example, in the first pipeline cycle, an incoming cell goes through the first stage of the switching apparatus, where a self-routing tag is computed and assigned to the cell. The first stage operation in the first pipeline cycle is denoted by the process state (1,1). After obtaining their routing tags in the first stage, the group of cells that arrived in the first pipeline cycle is sent to the second pipeline stage in the second pipeline cycle, denoted by process (2,2) in the time chart, for switching to the respective memory modules by the input interconnection network. In the second pipeline cycle, a new set of incoming cells is also sent to the first stage for obtaining their self-routing tags, which is indicated by the process state (1,2). The process (2,2) and process (1,2) are executed in parallel and, as the pipeline stages fill up with multiple tasks, a great degree of parallelism and hence a speed up in throughput is achieved by the disclosed switching method and apparatus while performing the switching of ATM cells.
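
The (s,t) notation of the time chart can be illustrated with a short Python sketch that lists which stages are busy in a given pipeline cycle. It is a simplified model under stated assumptions: one group of cells enters stage 1 per pipeline cycle, and each group advances exactly one stage per cycle, whereas in the disclosure the READ stage is driven by its sliding-window counter rather than by arrival groups. The function name and the `num_groups` parameter are hypothetical.

```python
from typing import List, Optional, Tuple


def process_states(cycle: int,
                   num_stages: int = 5,
                   num_groups: Optional[int] = None) -> List[Tuple[int, int, int]]:
    """Return (stage, cycle, group) triples active in one pipeline cycle.

    Group g is assumed to enter stage 1 in cycle g and to move one stage
    forward each cycle, so stage s works on group (cycle - s + 1).
    """
    active = []
    for stage in range(1, num_stages + 1):
        group = cycle - stage + 1
        if group >= 1 and (num_groups is None or group <= num_groups):
            active.append((stage, cycle, group))
    return active


if __name__ == "__main__":
    for t in (1, 2, 5):
        print(t, process_states(t, num_groups=8))
```

In this model, cycle 2 contains both (1,2) and (2,2), which is the parallelism described above for process states (1,2) and (2,2).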

When multiple stages of a switching apparatus operate in a pipeline fashion, it becomes quite important to have a way to update global variables in one stage and still have the updates available locally to another stage that needs them for its operation. This task is quite easy for switching systems that use a centralized controller, as any update to a global variable is coordinated and managed centrally and all the updates are readily accessible to all the components of the switching system that need them. The solution to this problem is not obvious for the pipeline based switching apparatus of the present invention. For example, for the pipeline system described in FIG. 9, when a cell is read out of the memory in the fourth stage, the information regarding the availability of the memory location in the global buffer space must be made available, in some way, to the first stage, where the new incoming cells are assigned self-routing parameters based on the current occupancy of the global buffer space. In order to achieve a coordinated operation of the present embodiment of the pipeline based switching apparatus, some additional update operations might be needed by some pipeline stages in order to accommodate the centrally updated global variables. The switching functions, along with the needed global variable update operations for the present embodiment of the disclosed switching apparatus, are presented for each pipeline stage as follows.

PIPELINE STAGE-1: SELF-ROUTING
PARAMETER (i,j,k) ASSIGNMENT STAGE

The parameter assignment stage consists of the header processing circuits and the parameter assignment circuit 14 of the switching apparatus. As mentioned earlier, an exemplary embodiment of the parameter assignment circuit 14 (FIG. 6) uses two processors 600 and 650. Both of these processors use sliding-window counters which are updated according to the flow diagram of FIG. 4. The parameter assignment circuit also uses other counters such as QLC 620 and LCC 630, and a scan table 660, in order to assign self-routing parameters to incoming cells. However, in order to correctly assign parameters to incoming cells, these counters need to be updated each cycle for dynamically changing global variables, for example, to account for outgoing cells and newly emptied memory locations due to the read operation performed in pipeline stage 4 of the switching apparatus. For each outgoing cell in pipeline stage 4, pipeline stage 1 needs to update the corresponding queue length counter (as it will be reduced by one for an outgoing cell) and the scan table 660 (as it needs to update the availability of memory locations in the global memory space for outgoing cells in pipeline stage 4).

The disclosed switching apparatus and method are configured to achieve best possible delay-throughput performance and employ the required minimum number of memory modules to this effect. The disclosed switching apparatus and method assign the self-routing parameters (FIG. 5) to incoming cells in such a way as to achieve best possible delay-throughput performance. The disclosed switching method guarantees that one cell is read out of the global buffer space each pipeline cycle for each output line of the switching apparatus, provided that a cell for the given output line is present inside the global buffer space. Accordingly, in the beginning of each pipeline cycle, stage 1 updates its queue length counters, i.e. QLC or Q_d 620, by decrementing non-zero queue lengths by one to account for the cells being read out of the global memory space for the respective output lines in the previous pipeline cycle of stage 4. Similarly, processor 2 updates its scan table in the beginning of each pipeline cycle to take into account the change in the occupancy of the global buffer space due to the read operation performed in the previous pipeline cycle by pipeline stage 4. In order to update its scan table, processor 2 makes use of the fact that in stage 4, each cycle, the cells pointed to by SW.osv and belonging to the scan-plane SW.sp are output (step 806, FIG. 8). Processor 2 therefore uses the previous value of the sliding-window SW(osv,sp) to update its scan table in the beginning of each pipeline cycle. In this process, it assumes that the cell in a slot in column SW.osv of its scan table has been output if the content of the slot is equal to SW.sp. To account for the output cells, processor 2 resets all such locations in its scan table to zero, to indicate the availability of the memory locations in the global buffer space.
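
The two start-of-cycle updates described above, decrementing the non-zero queue length counters and freeing the scan table slots that stage 4 has just read, can be sketched as follows. The data layout is an assumption made for illustration: `qlc` is a list with one counter per output line, `scan_table[i][j]` holds the scan-plane value of slot i in output slot vector j (0 meaning free), and columns are indexed from 0. None of these names come from the disclosure.

```python
from typing import List


def begin_cycle_update(qlc: List[int],
                       scan_table: List[List[int]],
                       prev_osv: int,
                       prev_sp: int) -> None:
    """Apply the stage-1 bookkeeping for cells read in the previous cycle.

    prev_osv and prev_sp are the previous sliding-window values
    (SW.osv as a 0-based column index, SW.sp as a scan-plane number >= 1).
    """
    # One cell per non-empty output queue was read out by stage 4.
    for d in range(len(qlc)):
        if qlc[d] > 0:
            qlc[d] -= 1

    # Slots in column prev_osv tagged with prev_sp were output; free them.
    for row in scan_table:
        if row[prev_osv] == prev_sp:
            row[prev_osv] = 0


if __name__ == "__main__":
    qlc = [0, 2, 1]
    st = [[1, 0], [2, 1], [1, 2]]
    begin_cycle_update(qlc, st, prev_osv=0, prev_sp=1)
    print(qlc, st)
```

Both updates use only values that stage 1 already holds locally, which is the point of the decentralized scheme: no message from stage 4 is needed.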

Each cycle, processor 1 performs operations in the following sequence:

(i) Update QLC (step 408 of FIG. 4) to account for outgoing cells in the previous cycle as follows:

For d=1 to r; // For each output line
if Q_d>0 then Q_d=Q_d-1

(ii) Update the sliding-window counter, i.e. SW.osv and SW.sp, to the next value according to steps 410-414 of the flow chart in FIG. 4.

(iii) Now proceed with the switching functions (step 404 of FIG. 4) for processor 1, which is the determination of parameters (j and k) as shown in steps 502-524 of the flow diagram in FIG. 5 for the incoming cells in that given cycle.

In the present embodiment of the switching apparatus, where the overall switching function has been partitioned into multiple stages that are made independent of each other so as to achieve a pipeline based switching operation, the pipeline stages that use a sliding-window counter update its value with reference to the pipeline cycle as opposed to the switch cycle (in steps 400-414 of FIG. 4).

Each cycle, processor 2 performs operations in the following sequence:

(i) Update (step 408 of FIG. 4) scan table 660 with the previous value of the sliding-window counter 670 to take into account the outgoing cells in the previous cycle. According to the disclosed switching method, each cycle, the cells belonging to the output slot vector SW.osv having their scan value k equal to SW.sp are output. This switching method is used to update the scan table as follows.

For i=1 to m; // For slots in the previous output slot vector
if ST(i,SW.osv)=SW.sp then set ST(i,SW.osv)=0.

(ii) Update the sliding-window counter, i.e. SW.osv and SW.sp, to the next value according to steps 410-414 of the flow diagram in FIG. 4.

(iii) Now proceed with the switching functions (step 404 of FIG. 4) for this processor, i.e. the assignment of parameter i for the incoming cells for the previously assigned values of output slot vector j and scan-plane k. In this process an available i-th memory slot in the j-th column of the scan table is assigned as the i parameter, and the scan-plane value k is stored in the corresponding scan table slot, i.e. ST(i,j)=k. Also, while assigning the i parameter, an attempt is made to assign different values of i to the cells belonging to the same cycle. This process helps to enhance the parallelism in the input and output mapping function performed by the input interconnection network 20 of stage 2 while routing the received cells or packets to different memory modules. One of the methods of assigning the i parameter using the scan table, for known values of j and k, is shown by the following pseudo code.

Each cycle, initialize t[1..m]=0; // this keeps track of assigned memory modules in a cycle
For each incoming cell of a cycle with parameters j and k;
For i=1 to m;
if (ST(i,j)=0 and t[i]=0) then
{set ST(i,j)=k; assign i for the cell's routing tag; t[i]=1; exit }

It can be noted from the pseudo code above that, while assigning the i parameter, processor 2 makes an attempt to assign different values of i, i.e. different memory modules, to the cells belonging to the same cycle.

Another way to assign the i parameter is to assign different values of i (i.e. different memory modules) to the cells belonging to the same cycle, but in increasing order. As an example, if i=3 has been assigned to a cell of the cycle, then for the next incoming cell an attempt is made to assign i>3; only if none of the greater values of i is available are the smaller values chosen.
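
One possible reading of this increasing-order variant is sketched below: candidate modules above the last assigned index are tried first, and the search wraps around to the smaller indices only if none of them is available. The wrap-around order, the 0-based indexing and the names are assumptions of this sketch rather than details given in the text.

```python
from typing import List, Optional


def assign_module_increasing(scan_table: List[List[int]],
                             used: List[bool],
                             j: int,
                             k: int,
                             last: Optional[int]) -> Optional[int]:
    """Assign a module for column j, preferring indices greater than `last`
    (the module given to the previous cell of this cycle, or None)."""
    m = len(scan_table)
    start = 0 if last is None else last + 1
    # Greater indices first, then the smaller ones as a fallback.
    candidates = list(range(start, m)) + list(range(0, start))
    for i in candidates:
        if scan_table[i][j] == 0 and not used[i]:
            scan_table[i][j] = k
            used[i] = True
            return i
    return None


if __name__ == "__main__":
    st = [[0], [0], [1], [0]]      # module 2 already occupied in column 0
    used = [False] * 4
    first = assign_module_increasing(st, used, j=0, k=1, last=2)
    second = assign_module_increasing(st, used, j=0, k=1, last=first)
    print(first, second)
```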

Once the assignment of the self-routing parameters (i,j,k) is completed in the first stage, the incoming cells are attached with their self-routing tags and are sent to the second pipeline stage in the following pipeline cycle.

PIPELINE STAGE-2: CELL ROUTING WITH
INPUT INTERCONNECTION NETWORK

In a given cycle, the input interconnection network receives cells which have been assigned a self-routing tag (i,j,k) in stage 1 in the previous pipeline cycle. The input interconnection network uses the i parameter of the received cells and performs routing of the cells to the memory modules denoted by their i parameter.

PIPELINE STAGE-3: ATM-CELL WRITE
OPERATION

In order to achieve write and read of ATM cells in the same cycle, dual port memory modules are employed in the switching apparatus of the present invention. The use of dual port memory modules for the disclosed switching apparatus and method has been discussed in an earlier section. The parameter assignment method of the disclosed invention ensures that the write and read of ATM cells never access the same memory location at the same time. Accordingly, the write of ATM cells is made independent of the read of ATM cells. During the write stage, the local memory controllers receive the routing tag information from the received cells and generate the respective addresses for the received cells to be written in the respective memory modules. The controllers use the flow diagram of FIG. 8 to perform their write operation.

PIPELINE STAGE-4: ATM-CELL READ
OPERATION

Pipeline stage 4 performs the read of ATM cells from the memory modules employed in the disclosed switching apparatus. Pipeline stage 4, also called the READ stage, basically consists of the local memory controllers performing the read operation according to the flow diagram of FIG. 8. The memory controllers use a sliding-window counter which is initialized to SW.osv=1 and SW.sp=1 in pipeline cycle 4. Because of the pipeline operation of the switching apparatus, the cells that have entered stage 1 with the initial value of the sliding-window counter become available to stage 4 for the READ operation only in pipeline cycle 4. Therefore, while following the sliding-window update process shown by the flow diagram in FIG. 4, the cycle (which is the pipeline cycle) in steps 400-414 is offset by 4 for the sliding-window counter of the READ stage. According to this, in the fourth pipeline cycle, the sliding-window counter of the READ stage will indicate "cycle=0" (step 400 of the flow diagram in FIG. 4), the sliding-window counter will be initialized to SW.osv=1 and SW.sp=1, and only then does the first read operation take place. Only after the first read of ATM cells from the memory modules, i.e. after pipeline cycle 4, is the sliding-window counter of the read stage updated in the beginning of each subsequent cycle. Each subsequent read operation is performed by the local controller based on the new value of the sliding-window counter.
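
The offset applied to the READ stage can be expressed as a small helper. This is only a sketch of the bookkeeping described above: it maps a pipeline cycle number to the "cycle" value seen by steps 400-414 in the READ stage, under the assumption that pipeline cycles are counted from 1 and that the READ stage is idle before pipeline cycle 4. The constant and function names are hypothetical.

```python
from typing import Optional

READ_STAGE_OFFSET = 4  # the READ stage is pipeline stage 4


def read_stage_cycle(pipeline_cycle: int,
                     offset: int = READ_STAGE_OFFSET) -> Optional[int]:
    """Cycle value used by the READ stage's sliding-window update.

    Returns None while the READ stage has nothing to read yet, 0 in the
    pipeline cycle where the counter is initialised (SW.osv=1, SW.sp=1),
    and 1, 2, ... in the cycles where the counter is advanced.
    """
    local_cycle = pipeline_cycle - offset
    return local_cycle if local_cycle >= 0 else None


if __name__ == "__main__":
    for t in range(1, 8):
        print(t, read_stage_cycle(t))
```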

PIPELINE STAGE-5: CELL ROUTING WITH
OUTPUT INTERCONNECTION NETWORK

Pipeline stage 5 mainly consists of the output interconnection network 60. Each memory module's data-out port is connected to an input line of the output interconnection network. In a given pipeline cycle, the output interconnection network receives the cells output by stage 4 in the previous pipeline cycle. The output interconnection network obtains the final destination address 'd' of each received cell and performs routing of the cells to their respective output line destinations. All the switching decisions are made locally by the output interconnection network based on 'd', i.e. the destination information in the header of the received cells.

EXAMPLE OF THE PIPELINE OPERATION OF
THE SWITCH

FIG. 10 shows an example of a configuration of a 4x4 ATM switching apparatus according to the disclosed invention. The switching apparatus, in this example, employs memory modules each having a capacity to store 12 ATM cells. The switching apparatus, in this example, is configured to handle a maximum queue length of 24 ATM cells within the global buffer space for any given output port. This means that two scan planes (p=2) need to be employed in the multidimensional buffer space of the switching apparatus of the disclosed invention. Based on these values, i.e. N=4 and p=2, the required minimum number of memory modules is calculated, using eq. (3), to be 6, i.e. m=6. Also shown in FIG. 10 is a stream of incoming cells input to the example switching apparatus for 8 pipeline cycles. In FIG. 10, the input ports of the 4x4 switch are denoted by W, X, Y and Z respectively. Also, the groups of cells arriving in the eight input cycles are denoted by the letters 'A' through 'H'. Each incoming cell is denoted by its output line destination address. For example, the cell arriving in the second pipeline cycle on input port X is destined to output line '2'. Similarly, the group of cells arriving in the second pipeline cycle is denoted by 'B'.

FIG. 11 also shows the different pipeline stages of the switching apparatus according to the present invention. Since the switching apparatus is 4x4 and uses 6 memory modules, a 4x6 self-routing, non-blocking interconnection network is used for pipeline stage 2. Similarly, a 6x4 self-routing, non-blocking interconnection network is used for pipeline stage 5 in the exemplary embodiment of the disclosed switching system according to this invention. Each memory module is implemented as a dual port memory and uses its local memory controller for WRITE and READ operations.
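
The sizing of this example can be retraced as a short worked calculation. It assumes that the scan length a equals the capacity of one memory module (12 cells), that the maximum queue length equals p.a, and that eq. (3) has the form 2N − ⌈N/p⌉; these readings are inferred from the surrounding text rather than stated in this passage.

```latex
\begin{aligned}
p &= \frac{\text{maximum queue length}}{a} = \frac{24}{12} = 2, \\
i &= \left\lceil \frac{N}{p} \right\rceil = \left\lceil \frac{4}{2} \right\rceil = 2, \\
m &= 2N - \left\lceil \frac{N}{p} \right\rceil = 8 - 2 = 6 .
\end{aligned}
```

With N=4 input lines and m=6 memory modules, the input network is 4x6 and the output network is 6x4, as used in the example.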

FIG. 12 shows the time chart for the pipeline operation of the exemplary 4x4 switching apparatus of FIG. 11 for 12 pipeline cycles. The incoming streams of ATM cells which are input for 8 pipeline cycles, as shown in FIG. 10, are used to demonstrate the detailed pipeline operation of the 4x4 switching apparatus according to the disclosed invention. For explanation purposes, A(1,1) in the time chart denotes stage 1 of cycle 1 and indicates that the input cells belonging to input group 'A' are being processed by the assignment stage of the switching apparatus of FIG. 11. Similarly, A(2,2) indicates that the group 'A' input cells are in the second pipeline stage in the second pipeline cycle, that is, the group 'A' input cells are being switched by the input interconnection network 20. A(3,3) indicates that the group 'A' input cells are being written to the respective j-th location of the i-th memory modules in stage 3 of pipeline cycle 3, which uses the flow diagram for the WRITE operation in FIG. 8. R(4,4) means that stage 4 is performing its read operation, according to the flow diagram in FIG. 8, in pipeline cycle 4. O(5,5) means that in stage 5 and pipeline cycle 5, cells that were read out of the memory modules in the previous pipeline cycle are being switched to their final output line destination 'd' by the output interconnection network. Because of the pipeline operation performed on the incoming cells by different stages, the output of cells begins in the fifth pipeline cycle. FIG. 12 shows that after the initial delay of 5 pipeline cycles, cells are output (if present in the buffer) every pipeline cycle thereafter. FIG. 12 shows the operation of different stages of the disclosed switching apparatus on the incoming cells, starting from A(1,1), when the first group of cells is input to the switching apparatus in the first stage in the first pipeline cycle, and ending at O(5,12), when a group of ATM cells is output by the 5th pipeline stage in the 12th pipeline cycle.

FIG. 13 shows the actual operation of the parameter assignment stage for the first group of incoming cells in the first pipeline cycle.

FIGS. 14-1 and 14-2 show the pipeline operations performed in the second pipeline cycle by the first two stages

FIGS. 23-1 and 23-2 show switching operations performed by only the last two stages in the 11th pipeline cycle, i.e. the READ stage and the output interconnection network stage. It can be noted that even though there are no new cells to be processed by the first stage, the switch still needs to process previously input cells in its memory space. Hence in the following cycles the last two stages will be active, outputting the cells resident in the memory space.

FIG. 24 only shows the last stage of the switch in the 12th pipeline cycle, which outputs the cells read in stage 4 (shown in FIG. 23-1) in the previous pipeline cycle, i.e. the 11th pipeline cycle. The READ stage is also active in the 12th pipeline cycle, however it is not shown.

The detailed time chart for input and output of the same stream of cell arrivals (shown in FIG. 10) is given in FIG. 25. FIG. 25 also shows switching operations in different pipeline stages at different pipeline cycles along with the sliding-window counter update process for the READ stage. It can be noted that for the READ stage the sliding-window counter update process starts only after the read of cells in the fourth cycle and thereafter, the READ stage sliding-window counter (which is resident in the memory controllers) continues to update itself for all the pipeline cycles.

FIG. 26a shows a stream of incoming cells input to the switching apparatus in consecutive cycles. FIG. 26b shows the occupancy of the multidimensional global buffer space after the WRITE operation performed by the switching apparatus in pipeline cycle 14 and the READ operation performed in pipeline cycle 15. The sliding-window counter in the 15th pipeline cycle in the read pipeline stage shows that it is currently processing the cells belonging to SW.osv=12 and SW.sp=1. The circled packets indicate the earlier occupancy of the cells in the global buffer space before being output in earlier cycles.

FIG. 27 shows the time chart for the input and output of the cell streams of FIG. 26a and the corresponding update of the sliding-window counter in the 4th pipeline stage where

of the 4x4 example switching apparatus of the present the read operation is performed to output ATM cells from 

invention. ^ parallel memory modules of the disclosed switching appa- 

FIGS. 15-1, 15-2, 15-3 show the pipeline operations ratus. 

performed by first three stages of the switching apparatus in FIG. 28a shows a stream of ATM cells input for 16 

the third pipeline cycle. consecutive pipeline cycles to the example 4x4 ATM switch- 

FIGS. 16-1, 16-2, 16-3, 16-4, 16-5 show the respective ing apparatus according to the present invention. The cell 

pipeline operations performed in the fourth pipeline cycle by 45 arrivals in the last several cycles are all destined to the 

different stages of the switching apparatus. output 4 and constitute an unbalanced traffic. For such a 

FIGS. 17-1 to 17-5 show the respective pipeline opera- traffic, it is important to control the queue buildup inside the 

tions performed by different pipeline stages in the 5 th common memory space. In the lack of any control, the entire 

pipeline cycle. memory space can be occupied by cells of a given output 

FIGS. 18-1 to 18-5 show the respective switching opera- so port and thus prevent establishment of any other connection 

tions performed by dilferent pipeline stages in the 6* for anv other P air of «P ut and out P ut P 0 ^ ' h ™gb the 

pipeline cycle common memory space. In the example switch of FIG. 11, 

... . r , growth of a queue inside the common memory space is 

FIGS. 19-1 to 19-5 show switching operations performed controlled b the parameter assignment circuit. Once the 

by different pipeline stages m the 7 pipeline cycle. ^ queuc lcngth excceds a threshold valuej ^ Qthcr mcoming 

FIGS. 20-1 to FIGS. 20-5 show switching operations ct \^ destined to the congested output port, are dropped, 

performed by different pipeline stages in the 8* pipeline ^ ows for olher input ports to establish connections 

cycle, through the global buffer space to non-congested output 

FIGS. 21-1 to 21-4 show switching operations performed ports, 

by different pipeline stages on the received cells in 9 th 6Q FIG. 28b shows an occupancy of the multidimensional 

pipeline cycle. buffer space after the WRITE operation is performed in the 

FIGS. 22-1 to 22-3 show switching operations performed 18 f * pipeline cycle by the 3 rd pipeline stage of the 4x4 

by different stages in the 10 th pipeline cycle. Note that in the switching apparatus of FIG. 11, according to the present 

10 th pipeline cycle, only three stages are active (in the sense invention. It is shown that the last three cells input in the 16 th 

that changes are taking place) and have received new cells 65 pipeline cycle were dropped as the length of the output 

to process, while the first two stages are idle and they do not queue destined to output port 4 reached its upper limit i.e. 24 

have any new cells to work on. ATM cells in the multidimensional global buffer space. FIG. 
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28fc also presents a senario of queue build up for a congested 
output port inside the multidimensional global buffer space 
with head-of-line (HOL) cell being resident in the output 
slot vector (OSV) on the scan plane (sp) as pointed by the 
sliding-window counter of the READ stage. The queue of 
cells destined to the output port 4 is shown by a dotted line 
and is marked as 'abcdef. The first segment of queue 'ab 1 
is resident on the second scan plane which holds the current 
traversal of the sliding-window. Consecutive arrival of cells 
destined to the output port 4 causes the cells to occupy slots 
in consecutive output slot vectors on the next scan plane and 
a queue £ cd } is formed on the first scan plane. Further arrival 
of cells destined to the output port 4 causes the cells to 
occupy available slots in consecutive output slot vectors on 
the second scan plane. The queue grows only up to the 
length p.a=24 cells. Any further arrival of cells destined to 
output port 4 are dropped as the output queue has reached its 
maximum length allowed in the finite global buffer space. 
The last segment of the queue is denoted by *ef ' where the 
last three incoming cells, destined to output port 4, were 
dropped. 
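
The queue-length control described above can be illustrated with a minimal sketch. It assumes that a per-output cell count is kept on behalf of the parameter assignment circuit and that the upper limit is 24 cells, as in the FIG. 28b example. The names QueueLimiter, admit and depart are hypothetical and do not come from the disclosure, and the way the assignment circuit actually tracks queue length is not reproduced here.

```python
from collections import defaultdict


class QueueLimiter:
    """Hypothetical per-output admission check (illustrative only)."""

    def __init__(self, max_queue_len=24):
        # 24 is the upper limit quoted for the FIG. 28b example.
        self.max_queue_len = max_queue_len
        # Number of cells currently buffered for each output port.
        self.queue_len = defaultdict(int)

    def admit(self, output_port):
        """Return True if the cell may be buffered, False if it is dropped."""
        if self.queue_len[output_port] >= self.max_queue_len:
            return False  # queue for this output is at its limit: drop
        self.queue_len[output_port] += 1
        return True

    def depart(self, output_port):
        """Record that one cell for this output was read out of the buffer."""
        if self.queue_len[output_port] > 0:
            self.queue_len[output_port] -= 1


limiter = QueueLimiter()
# 27 consecutive arrivals for output port 4 with no departures in between.
results = [limiter.admit(4) for _ in range(27)]
accepted = sum(results)
dropped = len(results) - accepted
# A cell destined to a non-congested output port is still admitted.
other_ok = limiter.admit(1)
```

Under these assumptions the first 24 cells for output port 4 are accepted and the remaining three are dropped, while a cell for output port 1 is still admitted, which is the behaviour the threshold is meant to guarantee.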

While the disclosed switching apparatus and the switching
method have been particularly shown and described with
reference to the preferred embodiments, it will be
understood by those skilled in the art that various modifications in
form and detail may be made therein without departing from
the scope and spirit of the invention. Accordingly,
modifications such as those suggested above in the document and
some more suggested as follows, but not limited thereto, are
to be considered within the scope of the present invention.
For example:

(i) In the exemplary embodiments described above, the 
disclosed switching apparatus and the switching method 
are illustrated for switching of ATM cells with multiple 
input ports and multiple output ports employing a plural- 
ity of memory modules and employing decentralized 
pipeline control. However, the same switching apparatus 
and switching method can be used with little or no
modification to switch fixed packets of another size (i.e. 
other than 53 bytes) or even to switch packets of variable 
lengths; 

(ii) In the preferred embodiments of the present invention, a
means to achieve decentralized pipeline control for the
overall switching function of the switching apparatus has
been described above. However, it may be possible to
control the disclosed switching apparatus by a centralized
controller instead, which may use the disclosed switching
method with some modifications;

(iii) In the exemplary embodiments described above, a
method for the assignment of self-routing parameters
(i,j,k) is described. It may be possible to build a faster
assignment circuit 14 which may modify the assignment
process, as shown by flow diagrams and as described
above in the respective sections, in order to achieve a
faster assignment or computation of the routing param-
eters (i,j,k).

(iv) In the exemplary embodiments described above, input 
modules have been employed in the system to hold the 
received cells for a predetermined length of time. It may 
be possible to use the disclosed switching apparatus and 
employ more modules (similar to input modules) or 
buffers at various points of the apparatus to adjust for the 
speed or for synchronization of various pipeline stage 
operations. 

(v) In the preferred embodiments of the present invention, a
means to achieve decentralized pipeline control for the
overall switching function of the switching apparatus has
been described where the overall switching function has
been partitioned in 5 different stages for its pipeline
operation. It may be possible to modify the partitioning of
the switching function to have more than 5 different
pipeline stages or less than 5 pipeline stages and accord-
ingly modify the pipeline operation of the disclosed
switching apparatus and the switching method.

(vi) In the pipeline operation of the disclosed switching
apparatus, dual-port memory modules have been used in
the example. It is possible to use single port memory
having twice or more the speed of dual port memory.

(vii) The disclosed switching apparatus and method accord-
ing to the present invention can manifest in various
embodiments depending on the kind of interconnection
networks used for input interconnection network 20 and
output interconnection network 60. Such modifications
are to be considered within the scope of the disclosed invention.

(viii) In the disclosed switching apparatus, it is possible to
modify the parameter assignment stage by partitioning the
process in two separate stages, where the first stage
determines the j and k routing parameters and the second
stage determines the i parameter.
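
As an illustration of the five-stage partitioning referred to in (v), the stage and cycle notation of the time chart can be reproduced with a short sketch. The stage names are taken from the description above. The helper names stage_in_cycle and earliest_output_cycle are hypothetical, and the sketch assumes that a group of cells advances by one stage per pipeline cycle, which gives only the earliest possible output cycle since a cell may wait in the buffer for more than one cycle.

```python
STAGES = [
    "parameter assignment",
    "input interconnection network",
    "WRITE to memory modules",
    "READ from memory modules",
    "output interconnection network",
]


def stage_in_cycle(entry_cycle, cycle):
    """Return the 1-based stage reached in `cycle` by cells that entered
    stage 1 in `entry_cycle`, or None if they are outside the pipeline."""
    stage = cycle - entry_cycle + 1
    return stage if 1 <= stage <= len(STAGES) else None


def earliest_output_cycle(entry_cycle):
    """Earliest cycle in which cells entering in `entry_cycle` reach stage 5."""
    return entry_cycle + len(STAGES) - 1


# Trace the first input group in the (stage, cycle) notation of the time chart.
trace = [(stage_in_cycle(1, c), c) for c in range(1, 6)]
for stage, cycle in trace:
    print(f"({stage},{cycle}): {STAGES[stage - 1]}")
```

Changing the length of STAGES models a partitioning into more or fewer than five stages, with the initial output delay changing accordingly.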
What is claimed is: 

1. A high-speed data switching apparatus for processing
and switching of input information as data packets between
a plurality of input lines and a plurality of output lines, each
packet having a data portion and a header portion, the header
portion carrying a packet's destination as an output line, the
high-speed switching apparatus comprising:

a self-routing parameter assignment circuit that generates
a self-routing tag corresponding to each input packets
for said self-routing tags to be attached to input packets
and for propagating said packets with attached self-
routing tags through said switching apparatus;

a plurality of memory modules to perform local write and
read memory operations for received packets;

an input interconnection network for receiving each of
said input packets with said attached self-routing tags
to independently route said received packets to a par-
ticular one of said memory modules based only on the
information in said attached self-routing tags;

an independent local memory controller coupled to each
corresponding one of said memory modules and that
operates independent of any other memory controller in
the switching apparatus and uses only the information
available locally and in said attached self-routing tag of
received packet to calculate its corresponding memory
module addresses to perform said local write and read
memory operations for said received packet for switch-
ing purposes; and

an output interconnection network coupled to said plu-
rality of memory modules for using only said destina-
tion information from the header portion of said pack-
ets to route said packets read out from said plurality of
memory modules to a corresponding one of said output
lines.

2. The switching apparatus of claim 1, further comprising:

a separate header processing circuit is provided for each
of said input lines and is coupled between said input
lines and said input interconnection network;

wherein each of said header processing circuits is coupled
to said self-routing parameter assignment circuit;

wherein said header processing circuits are used for
receiving packets from said input lines and for obtain-
ing a self-routing tag or indications of memory over-
flow from said self-routing parameter assignment cir-
cuit for each one of said received packets based on their
output line destination, and for dropping packets in
case of memory overflow and for selectively attaching
said self-routing tag to said received packets and for
forwarding said packets with said attached self-routing
tags to said input interconnection network of said
switching apparatus;

wherein said input interconnection network is coupled to 
said plurality of memory modules; 

wherein said plurality of memory modules are coupled to 
said output interconnection network; and 

wherein said output interconnection network is coupled to 
said output lines of said switching apparatus. 

3. The switching apparatus of claim 1, wherein 

said plurality of memory modules consists of packet 
locations forming a multidimensional global memory 
space to be shared by said plurality of input and output 
lines; 

said packet's location in said multidimensional global 
memory space is represented by packet location param- 
eters which consists of coordinates of said multidimen- 
sional global memory space; 

said self-routing assignment circuit calculates said packet 
location parameters to assign a proper location for said 
received packets in said multidimensional global 
memory space; and 

said self-routing assignment circuit uses packet location 
parameters to generate self-routing tags to be attached 
by said header processing circuits to said received 
packets for their propagation through multiple stages of 
said switching apparatus. 

4. The switching apparatus of claim 1, wherein 

said self-routing tag attached to said input packets is 
carried through said input interconnection network, 
said memory modules and said output interconnection 
network of said switching apparatus; 

said input interconnection network uses only the infor-
mation in said self-routing tag of said received packets
to independently route said packets to corresponding
memory modules;

said memory modules along with its corresponding local 
memory controllers use only the information available 
locally and in said self-routing tag of received packets 
to locally calculate WRITE and READ addresses and 
independently perform local memory operations and 
management; 

said memory controllers coupled to memory modules 
operate independently of any centralized controller or 
any other memory controllers in the switching appara- 
tus; and 

said output interconnection network uses only the desti- 
nation information in the header portion of the received 
packets from a plurality of memory modules to route 
said packets to corresponding output lines. 

5. The switching apparatus of claim 1, wherein 

the number of memory modules used in said switching 
apparatus is dependent on memory sharing schemes 
used for said multidimensional global memory space; 
and 

the number of memory modules used can be less than the 
sum of the number of input and output lines less one. 

6. The switching apparatus of claim 1, wherein 

data packets are received and processed as packets of
fixed lengths, each packet having a data portion and a
header portion, header portion of which carries a pack-
et's destination as an output line of the switching
apparatus; and

said packets of fixed length may belong to Asynchronous 
Transfer Mode (ATM). 

7. The switching apparatus of claim 1, wherein 

data packets are received and processed as packets of 
fixed lengths, each packet having a data portion and a 
header portion, header portion of which carries a pack- 
et's destination as an output line of the switching 
apparatus; and 

said packets of fixed length may belong to Synchronous 
Transfer Mode (STM). 

8. The switching apparatus of claim 1, wherein 

data packets are received and processed as packets of 
variable lengths, each packet having a data portion and 
a header portion, header portion of which carries a 
packet's destination as an output line of the switching 
apparatus. 

9. A method for processing and switching of input infor- 
mation as data packets between a plurality of input lines and 
a plurality of output lines, each packet having a data portion 
and a header portion, and header portion carrying a packet's 
destination as an output line, the method of switching 
comprising the steps of: 

independently generating a self-routing tag for input 
packets by using a self-routing parameter assignment 
circuit; 

attaching said self-routing tag to said input packets for 
propagating said packets through said switching appa- 
ratus; 

performing local write and read memory operations for 
received packets with a plurality of memory modules; 

coupling an input interconnection network between said 
input lines and said plurality of memory modules; 

receiving said input packets with said attached self- 
routing tag at said input interconnection network; 

routing said received packets with attached self-routing
tags to a particular one of said memory modules by
using said input interconnection network and based
only on the information in said self-routing tags;

coupling a local memory controller to each of said plu- 
rality of memory modules; 

operating said local memory controllers independently of 
any centralized controller or any other memory con- 
troller in the switching apparatus; 

generating said local WRITE and READ memory 
addresses for local memory operations for said plurality 
of memory modules using only said attached self- 
routing tags of received packets and the local informa- 
tion available to said memory controllers; 

coupling an output interconnection network to said plu- 
rality of memory modules; and 

routing said received packets from said plurality of 
memory modules to a corresponding one of said output 
lines by using said output interconnection network and 
based only on destination information in the header 
portion of said received packets. 

10. The method of switching in claim 9, further compris- 
ing the steps of: 

using a separate header processing circuit to couple each 
of said input lines to said input interconnection net- 
work; 

coupling each of said header processing circuits to said
self-routing parameter assignment circuit of said
switching apparatus;

obtaining a self-routing tag or indication of memory
overflow from said self-routing parameter assignment
circuit for said input packets based on their output line
destinations, using said header processing circuit;

selecting said input packets to be propagated through said
switching apparatus and attaching said self-routing tag
to said packets selected to be propagated, using said
header processing circuits;

forwarding said packets with attached self-routing tags to
said input interconnection network of said switching
apparatus;

coupling said input interconnection network to said plu-
rality of memory modules;

coupling said plurality of memory modules to said output
interconnection network; and

coupling said output interconnection network to said 
output lines of said switching apparatus. 

11. The method of switching in claim 9 further comprising
the steps of:

depicting the entire memory locations for packets in said
plurality of memory modules of said switching appa-
ratus as a multidimensional global memory space to be
shared by said plurality of input and output lines;

using coordinates of said multidimensional global
memory space as packet location parameters to identify
a packet's location in the multidimensional global
memory space;

calculating said packet location parameters to assign a
proper location for said input packets in said multidi-
mensional global memory space, using said self-
routing assignment circuit; and

generating self-routing tags based on said packet location
parameters to be attached to said input packets for their
self propagation through said switching apparatus.

12. The method of switching in claim 9 further compris- 
ing steps of: 

obtaining indication of memory overflow from said self-
routing parameter assignment circuit for said input
packets destined to said output lines and dropping said
packets causing memory overflow in said switching
apparatus using said header processing circuits;

propagating input packets which were not dropped with
attached corresponding said self-routing tags through
said switching apparatus using header processing cir-
cuits;

using only the information in said self-routing tags of said
received packets to route said received packets to one
of said plurality of memory modules by said input
interconnection network;

using only the local information and said self-routing tag
of received packets to locally calculate WRITE and
READ addresses and independently perform local
memory operations for said plurality of memory mod-
ules;

enabling local memory controllers coupled to said plu-
rality of memory modules to perform operation based
only on the information available locally and indepen-
dent of any centralized controller or any other memory
controllers in the switching apparatus; and

using only said destination information in said header
portion of said packets to route said packets read out of
said plurality of memory modules to one of said output
lines by said output interconnection network.

13. A method of switching of data packets between a 
plurality of input lines and a plurality of output lines of a 
switching apparatus employing a plurality of memory mod- 
ules to be shared by the plurality of input and output lines, 
the method comprising the steps of: 

depicting entire memory locations for packets in said 
plurality of memory modules of said switching appa- 
ratus as a multidimensional global memory space to be 
shared by said plurality of input and output lines; 

identifying a packet's location in said multidimensional 
global memory space by using coordinates of said 
multidimensional global memory space as packet loca- 
tion parameters; 

assigning a location for input packets in said multidimen- 
sional global memory space by using said packet 
location parameters; 

creating self-routing parameters to enable independent 
memory management and self-routing of packets 
through said switching apparatus by using said packet 
location parameters; 

attaching self-routing parameters to said input packets for
their propagation through said switching appara-
tus;

identifying a set of memory locations in said multidimen- 
sional global memory space to write said input packets 
to, and to read output packets from said multidimen- 
sional global memory space, using a pointer; and 

updating said pointer each switch cycle, in order for said
pointer to identify a newer set of memory locations in
a newer switch cycle to write input packets to, and to
read output packets from said multidimensional global
memory space.

14. The method of creating said self-routing parameters of 
claim 13 further comprising the steps of: 

updating current value of said pointer each switch cycle;

keeping a count of packets destined to a plurality of said
output lines;

tracking said self-routing parameters of most recent 
packet assigned by said self-routing parameter assign- 
ment circuit for said plurality of output lines; 

tracking occupancy status of said multidimensional global 
memory space by using an occupancy matrix which 
represents occupancy and status of packets in said 
multidimensional global memory space; and 

updating said occupancy matrix each switch cycle to 
represent most recent occupancy and status of packets 
in said multidimensional global memory space of said 
switching apparatus. 
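
The pointer of claim 13 and the occupancy matrix of claim 14 can be pictured with a toy model. The sketch below is only an assumption-laden illustration: it treats the global memory space as a fixed number of sets of packet slots, lets the pointer advance by one set per switch cycle and wrap around cyclically, and frees a set as it is read. The class name GlobalBuffer, its methods and the two-dimensional layout are hypothetical and are not taken from the claims.

```python
class GlobalBuffer:
    """Toy model: `num_sets` sets of `set_size` packet slots (assumed layout)."""

    def __init__(self, num_sets, set_size):
        self.num_sets = num_sets
        # occupancy[s][m] is True when slot m of set s holds a packet.
        self.occupancy = [[False] * set_size for _ in range(num_sets)]
        # Index of the set of locations served in the current switch cycle.
        self.pointer = 0

    def write(self, set_index, slot):
        """Mark a slot occupied; return False if it was already in use."""
        if self.occupancy[set_index][slot]:
            return False
        self.occupancy[set_index][slot] = True
        return True

    def read_current_set(self):
        """Read out and free every occupied slot in the set under the pointer."""
        row = self.occupancy[self.pointer]
        read_slots = [m for m, used in enumerate(row) if used]
        for m in read_slots:
            row[m] = False
        return read_slots

    def advance(self):
        """Update the pointer once per switch cycle (cyclic wrap is assumed)."""
        self.pointer = (self.pointer + 1) % self.num_sets


buf = GlobalBuffer(num_sets=4, set_size=3)
buf.write(0, 1)
buf.write(1, 0)
buf.write(1, 2)

first = buf.read_current_set()   # slots read in the first switch cycle
buf.advance()
second = buf.read_current_set()  # slots read in the second switch cycle
```

In this model the occupancy matrix is updated on every write and read, so it always reflects which locations of the global memory space currently hold packets.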

* * * * *